[0001] This application claims benefit under 35 U.S.C. § 119(e) of Provisional Patent 
Application Serial Number 60/225,554, filed August 16, 2000, incorporated herein by reference. 
[0002] This invention was made with government support under United States Department of 
Energy grant number DEFG 0294ER20138 and National Science Foundation grant number 
0095075IBN. The U.S. Government has certain rights in this invention. 

BACKGROUND 

[0003] The present invention is directed to methods to alter secondary metabolism of a 
plant, more specifically phenylpropanoid metabolism. The present invention is also directed 
to novel mutant polynucleotide molecules, referred to as ref8, that encode an Arabidopsis 
p-coumarate 3-hydroxylase having altered biological activity. The present invention is also 
directed to uses of the novel nucleotide sequences set forth herein, including their use in 
vectors and other DNA constructs for transforming plants and microorganisms. The DNA 
constructs and transgenic plants are further aspects of the present invention. 
[0004] The publications, patents and other materials used herein to illuminate the 
background of the invention, and in particular cases, to provide additional details respecting 
the practice, are incorporated by reference, and for convenience are referenced in the 
following text by author and date and are listed alphabetically by author in the appended 
bibliography. 

[0005] By way of background, C3H is an enzyme of the phenylpropanoid pathway. 
Phenylpropanoid compounds have a wide array of important functions in plants. They serve 
in the interaction of plants with their biotic and abiotic environments, mediate certain aspects 
of plant growth and development, and are important structural components of the plant 
secondary cell wall. For example, stilbenes and isoflavones are important phytoalexins in 
plants (Nicholson and Hammerschmidt, 1992). In maize and petunia, flavonoids have been 
shown to be necessary for pollen viability (Coe et al., 1981; Taylor and Jorgensen, 1992; van 
der Meer et al., 1992), and have been suggested to be endogenous modulators of auxin 
transport (Mathesius et al., 1998). Hydroxycinnamic acids lead to the synthesis of 
UV-sunscreens in plants (Landry et al., 1995), and are also precursors for lignin (Lewis and 
Yamamoto, 1990). Simpler phenylpropanoid-derived molecules such as acetosyringone act 
as signaling molecules in the interaction of plants with Agrobacterium (Stachel et al., 1985). 
Lignan glycosides known as dihydrodiconiferyl glycosides (DCGs) have cytokinin-like 
activity in plants (Binns et al., 1987; Lynn et al., 1987; Teutonico et al., 1991; Orr and Lynn, 
1992), and may be responsible for growth abnormalities seen in some transgenic plants in 
which phenylpropanoid metabolism has been perturbed (Tamagnone et al., 1998). 
Phenylpropanoids are also increasingly being recognized as having an impact on human 
health. For example, isoflavones and lignans have beneficial estrogen-like activity in humans, 
which is prompting their use as nutraceuticals (Bingham et al., 1998), and the stilbene 
resveratrol is thought to provide the health benefits associated with moderate wine 
consumption (Jang et al., 1997). All of the above examples make a compelling argument for 
improving our understanding of phenylpropanoid metabolism and its regulation. 
[0006] Advances in biotechnology have provided the tools with which to manipulate 
phenylpropanoid metabolism, and a number of cases have illustrated the potential value of 
this approach. The capacity to synthesize resveratrol has been transferred to tobacco by 
transformation with a construct encoding stilbene synthase (Hain et al., 1993). Flower 
pigmentation has been successfully manipulated in petunia by introduction of the maize gene 
encoding dihydroflavonol reductase (Meyer et al., 1987). Similarly, novel and valuable 
varieties of cut flowers are being generated by introduction of the gene encoding flavonoid 
3',5'-hydroxylase, which leads to the accumulation of blue trihydroxy-substituted 
anthocyanins (Holton et al., 1993). The manipulation of lignin biosynthesis has also been 
extensively investigated, with results ranging from substantial decreases in total lignin 
content to dramatic changes in lignin monomer composition (Meyer et al., 1998). As 
additional targets for the metabolic engineering of phenylpropanoid metabolism are 
investigated, their manipulations may lead to plants with enhanced nutritional value, crops 
that synthesize large amounts of secondary metabolites for industrial use, the modification of 
lignin quality and quantity in plants, and plants with enhanced UV tolerance. For these 
approaches to be successful, it is essential that we have a thorough knowledge of all of the 
catalysts involved. 

[0007] Most of the genes encoding the enzymes of the phenylpropanoid pathway have 
been cloned over the last ten years by standard biochemical approaches, and since their 
original isolation, an array of orthologues have been cloned from various species. These 
include the genes encoding caffeoyl CoA O-methyltransferase (CCoAOMT), cinnamate 
4-hydroxylase (C4H), cinnamyl alcohol dehydrogenase (CAD), cinnamoyl CoA reductase 
(CCR), 4-(hydroxy)cinnamoyl CoA ligase (4CL), phenylalanine ammonia-lyase (PAL), and 
caffeic acid/5-hydroxyferulic acid O-methyltransferase (COMT). The two cytochrome 
P450-dependent monooxygenases (P450s) in the pathway, C4H and ferulate 5-hydroxylase (F5H), 
were more difficult targets because the instability, low abundance, and membrane-bound 
nature of plant P450s make conventional purification problematic. Despite these 
difficulties, the gene encoding C4H was recently identified (Mizutani et al., 1993b; Teutsch 
et al., 1993) following purification of the enzyme (Gabriac et al., 1991; Mizutani et al., 
1993a). Because the activity of F5H had been detected only once in plant extracts (Grand, 
1984), and because F5H proved unstable to purification, the detailed characterization of F5H 
was made possible only through the genetic analysis of the Arabidopsis fah1 mutant (Chapple 
et al., 1992). Using this mutant, the gene encoding F5H was cloned by T-DNA tagging, an 
approach that circumvented the requirement of protein purification (Meyer et al., 1996). 
[0008] The biosynthesis of many phenylpropanoids requires two distinct hydroxylation 
steps. C4H introduces the first hydroxyl group at the 4-position of the aromatic ring of 
cinnamic acid. C4H activity is readily measured in plants, and was one of the first plant 
enzymes to be recognized to be a P450. The next hydroxylation occurs at the 3-position of 
the ring, and is necessary for the synthesis of many important phenylpropanoid compounds. 
In contrast to C4H, the 3-hydroxylase of the phenylpropanoid pathway has not been fully 
characterized. The enzyme that catalyzes this reaction is known as p-coumarate 
3-hydroxylase (C3H), although this hydroxylation may also be carried out at the CoA thioester 
level by p-coumaroyl CoA 3-hydroxylase (pCCoA3H). It is not clear which of these two 
activities is relevant to phenylpropanoid metabolism because the 3-hydroxylase is an 
enigmatic enzyme. It has eluded attempts over the last thirty years to unambiguously 
characterize it in detail at the enzymatic level. It was the last gene of the phenylpropanoid 
pathway to be cloned. 

[0009] Over the past thirty years, many researchers have attempted to assay, 
characterize and purify C3H. C3H activity has been detected in extracts of spinach beet, 
sorghum, oak, mung bean, and potato (Vaughan and Butt, 1969; Vaughan and Butt, 1970; 
Alibert et al., 1972; Bartlett et al., 1972; Stafford and Dresler, 1972; Halliwell, 1975; Duke 
and Vaughn, 1982; Bolwell and Butt, 1983; Boniwell and Butt, 1986; Kojima and Takeuchi, 
1989). C3H has been characterized as a copper-containing mixed function oxidase (Vaughan 
and Butt, 1970) that requires an electron donor for activity. In most cases ascorbate has been 
found to be the optimal donor, although it is often required in very high concentration with 
Km values as high as 10 mM (Kojima and Takeuchi, 1989). NADPH and 
2-amino-4-hydroxy-6,7-dimethylpteridine also served as reductants in some cases (Vaughan and Butt, 
1970; Stafford and Dresler, 1972), whereas other enzyme preparations showed an absolute 
requirement for FAD or FMN (Boniwell and Butt, 1986). C3H has been reported to be 
associated with the chloroplast thylakoid membranes, where it was suggested that 
plastoquinone or ferredoxin could serve as the electron donor in vivo (Bartlett et al., 1972). 
[0010] In most experiments, C3H activity was associated with a phenolase activity 
which oxidizes dihydroxyphenols to their corresponding orthoquinones. In some cases, C3H 
activity could be purified away from phenolases, but generally the semi-purified C3H 
preparations retained substantial ability to oxidize dihydroxyphenols (Stafford and Dresler, 
1972). Still other experiments were aimed at correlating light- and wound-induced increases 
in PAL and C4H with induction of putative C3H activities (Bolwell and Butt, 1983). Once 
high background levels of phenolase were accounted for, some increases in C3H activity 
could be identified, and although the corresponding protein was partially purified it was not 
studied further. In experiments using mung bean seedlings treated with the fungal toxin 
tentoxin, phenolase activity was completely eliminated while the accumulation of caffeic acid 
in vivo and in vitro remained unaffected. These experiments provided strong evidence that 
distinguished C3H from phenolase (Duke and Vaughn, 1982). 

[0011] Other research has suggested that the 3-hydroxylation reaction occurs at the 
level of p-coumaroyl esters such as p-coumaroyl quinate, p-coumaroyl shikimate, or 
p-coumaroyl glucose (Heller and Kühnl, 1985; Kühnl et al., 1987; Tanaka and Kojima, 1991). 
Based upon their association with membranes and classical inhibitor studies, the first two 
activities were attributed to P450s. The latter enzyme appeared to be closely related to the 
aforementioned phenolases and its involvement in phenylpropanoid biosynthesis has been 
viewed skeptically by some authors (Wang et al., 1997). 

[0012] Finally, another body of work suggests that 3-hydroxylation occurs at the level 
of the CoA thioester, and that the product of this reaction is used both as a primer for 
dihydroxylated anthocyanin biosynthesis, and as an acyl-donor. In Silene dioica, the P gene 
controls hydroxylation of the 3' position of the anthocyanin B ring and the substitution pattern 
of the acyl-moiety esterified to the anthocyanin (Kamsteeg et al., 1980). Wild-type 
anthocyanins are caffeic acid esters of dihydroxy-substituted cyanidin glucosides, while 
homozygous p mutants accumulate monohydroxylated pelargonidin glucosides that are 
esterified with p-coumaric acid. In this system, the pCCoA3H activity was shown to be an 
NADPH-dependent monooxygenase, and this activity was shown to be absent in p/p petal 
extracts (Kamsteeg et al., 1981). The generality of these findings in relation to flavonoid 
synthesis is in doubt, however, since in other systems flavonoid hydroxylation occurs at the 
dihydroflavonol level and is catalyzed by specific P450s (Holton et al., 1993; Brugliera et al., 
1999). A Zn2+- and ascorbate-dependent pCCoA3H has also been assayed in elicitor-induced 
cultures of parsley cells (Kneusel et al., 1989). The activity of this enzyme was shown to be 
highly sensitive to pH, and this was suggested to be a mechanism for enzyme activation in 
response to elicitation. While the nature and identity of pCCoA3H remain questionable, the 
presence of CCoAOMT in plants (Pakusch et al., 1989; Schmitt et al., 1991; Ye et al., 1994; 
Ye and Varner, 1995), and the recent demonstration that its activity contributes substantially 
to lignin biosynthesis (Zhong et al., 1998) suggest that pCCoA3H activity may be relevant to 
phenylpropanoid metabolism. 

[0013] The potential success or failure of metabolic engineering efforts hinges upon a 
thorough understanding of the target pathway. Similarly, the ability to interpret data from 
experiments that examine plant responses to pathogen or herbivore attack depends upon a 
comprehensive understanding of the metabolic framework that underlies those responses. 
One example that is particularly relevant to this proposal can be found in the recent rewriting 
of the phenylpropanoid pathway that has been the unexpected outcome of experiments aimed 
at the modification of lignin content and composition. 

[0014] The longstanding model of phenylpropanoid metabolism has postulated a 
branched but linear pathway (Higuchi, 1981). According to this model, the phenylpropane 
skeleton of phenylalanine is converted to hydroxycinnamic acids which serve as precursors 
for flavonoids, lignin and hydroxycinnamic acid esters. More recently, a different route for 
the biosynthesis of lignin monomers has received attention (Kneusel et al., 1989; Kühnl et 
al., 1989; Pakusch et al., 1989; Pakusch et al., 1991; Schmitt et al., 1991; Ye et al., 1994; 
Ye and Varner, 1995; Zhong et al., 1998). This so-called "alternative pathway" involves the 
activation of p-coumaric acid to its coenzyme A thioester, followed by hydroxylation and 
methylation reactions that ultimately generate feruloyl-CoA. Considering that ferulic acid 
can also be synthesized by the free acid pathway and can be activated to its CoA thioester by 
4CL, lignin monomer biosynthesis probably occurs via a cross-linked network of pathways. 
Indeed, the continued accumulation of guaiacyl lignin in COMT-suppressed plants 
(Atanassova et al., 1995; Van Doorsselaere et al., 1995) indicates that the alternative pathway 
is a major contributor to lignin biosynthesis in woody plants. This hypothesis has been tested 
directly by the generation of transgenic tobacco downregulated for caffeoyl-CoA 
O-methyltransferase (CCoAOMT) activity (Zhong et al., 1998). These plants had lower total 
lignin content, demonstrating that the alternative pathway is a quantitatively important route 
for monolignol biosynthesis and that COMT activity cannot compensate for a decrease in the 
expression of CCoAOMT. 

[0015] In addition to the incorporation of the "alternative pathway", data from the 
present research and that of others has necessitated a further revision of the lignin 
biosynthetic pathway (Humphreys et al., 1999; Osakabe et al., 1999). In these experiments, 
F5H expressed in yeast demonstrated Michaelis-Menten kinetics with regard to ferulate 
hydroxylation with a Km of 1 mM and a Vmax of 4 pkat mg⁻¹ protein. This Km was 
unexpectedly high considering that C4H, a P450 three steps earlier in the pathway, exhibits a 
4 µM Km for its substrate (Urban et al., 1994). This inconsistency led us to test the 
hypothesis that phenylpropanoid pathway intermediates other than ferulate might be better 
substrates for F5H. Assays conducted with coniferaldehyde demonstrated that the Km and 
Vmax of F5H for this substrate were 1 µM and 5 pkat mg⁻¹, respectively, and the 
corresponding values for coniferyl alcohol were 3 µM and 6 pkat mg⁻¹. These data strongly 
suggest that coniferaldehyde and coniferyl alcohol are the preferred substrates for F5H, and 
that F5H probably acts later in the pathway than was previously envisioned. Other 
experiments have also suggested that COMT is actually a 5-hydroxyconiferyl 
alcohol/5-hydroxyconiferaldehyde O-methyltransferase that acts immediately downstream of F5H in 
the lignin biosynthetic pathway (Humphreys et al., 1999; Osakabe et al., 1999; Li et al., 
2000). 
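
By way of illustration, the substrate preference implied by the kinetic constants quoted above for F5H can be compared using the ratio Vmax/Km. The following is a minimal Python sketch under stated assumptions: it uses only the Km and Vmax values given in the preceding paragraph, and the names F5H_KINETICS, specificity, rate and rank_substrates are hypothetical labels chosen for this example.

```python
# Kinetic constants for yeast-expressed F5H as quoted in the text.
# Km is in micromolar; Vmax is in pkat per mg protein.
F5H_KINETICS = {
    "ferulate": {"km_uM": 1000.0, "vmax": 4.0},
    "coniferaldehyde": {"km_uM": 1.0, "vmax": 5.0},
    "coniferyl alcohol": {"km_uM": 3.0, "vmax": 6.0},
}


def specificity(km_uM, vmax):
    """Return Vmax/Km, a relative measure of substrate preference."""
    return vmax / km_uM


def rate(s_uM, km_uM, vmax):
    """Michaelis-Menten rate at substrate concentration s_uM (micromolar)."""
    return vmax * s_uM / (km_uM + s_uM)


def rank_substrates(kinetics):
    """Return substrate names ordered from most to least preferred."""
    return sorted(
        kinetics,
        key=lambda name: specificity(**kinetics[name]),
        reverse=True,
    )


if __name__ == "__main__":
    for name in rank_substrates(F5H_KINETICS):
        constants = F5H_KINETICS[name]
        print(name, specificity(**constants), rate(10.0, **constants))
```

Under these assumptions the Vmax/Km ratio for coniferaldehyde exceeds that for ferulate by roughly three orders of magnitude, which is consistent with the suggestion that F5H acts later in the pathway.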

[0016] The experiments described above, among others, have demonstrated that 
understanding of phenylpropanoid metabolism is still incomplete. Although plant secondary 
metabolism has been studied for many decades, modern molecular, biochemical, and genetic 
investigations have led to substantial recent revisions in conventional thinking about how the 
products of this pathway are synthesized. The most notable remaining gap in knowledge of 
the phenylpropanoid pathway is C3H. 

[0017] Certain intermediates of the phenylpropanoid pathway are precursors for lignin. In 
a parallel manner, in the last decade, our understanding of lignin biosynthesis has rapidly 
progressed. In many cases, the genetic manipulation of genes encoding enzymes of the 
conventional lignin pathway has generated unexpected results which have led the scientific 
community to re-evaluate lignin biosynthesis. The analysis of transgenics and mutants has 
demonstrated that genetically modified lignins may possess significant advantages over and 
above traditional raw materials currently used in the pulp and paper industry. In order to 
further "fine-tune" lignin profiles in economically important plant species in a rational 
manner, new biotechnological strategies must be employed. Thus, it is also desired to 
identify novel target genes in the biosynthesis of lignin by molecular and genetic approaches. 

SUMMARY OF THE INVENTION 
[0018] The present invention is directed to methods to alter secondary metabolism of a 
plant, more specifically phenylpropanoid metabolism. The present invention is also directed 
to novel mutant polynucleotide molecules, referred to as ref8, that encode an Arabidopsis 
p-coumarate 3-hydroxylase having altered biological activity. The present invention is also 
directed to uses of the novel nucleotide sequences set forth herein, including their use in 
vectors and other DNA constructs for transforming plants and microorganisms. The DNA 
constructs and transgenic plants are further aspects of the present invention. 
[0019] In one aspect of the invention, sense and antisense suppression methods or virus 
induced gene silencing methods for producing plants having altered phenylpropanoid 
metabolism are provided. In one embodiment of this aspect, quantitative lignin biosynthesis 
is altered. In a second embodiment, qualitative lignin biosynthesis is altered. In a third 
embodiment, flavonoid content is increased. In a fourth embodiment, isoflavonoid content is 
increased. In a fifth embodiment, anthocyanin content is increased. In a sixth embodiment, 
cell wall bound conjugates are decreased. 

[0020] In a second aspect of the invention, constructs comprising at least a portion of 
REF8 nucleic acid are provided for altering phenylpropanoid metabolism. The constructs 
generally comprise a heterologous promoter, i.e., one not naturally associated with the wild- 
type REF8 gene. The REF8 nucleic acid may be in the sense or antisense orientation with 
respect to the promoter. Vectors containing the construct for use in transforming plants or 
microorganisms are also provided. Any plant cells can be transformed in accordance with the 
present invention. Preferred plant cells are plant cells of woody plants. Preferred 
microorganisms are bacteria and yeast. 

[0021] In a third aspect of the invention, the DNA and protein sequences are provided 
for the ref8 gene. 

[0022] In a fourth aspect of the invention, plants having at least one cell transformed 
with a construct containing REF8 nucleic acid for altering secondary metabolism are 
provided. Such plants have a phenotype characterized by altered secondary metabolism. 
Suitable plants may include but are not limited to alfalfa (Medicago sp.), rice (Oryza sp.), 
maize (Zea mays), oil seed rape (Brassica sp.), forage grasses, and also tree crops such as 
eucalyptus (Eucalyptus sp.), pine (Pinus sp.), spruce (Picea sp.) and poplar (Populus sp.), as 
well as Arabidopsis sp. and tobacco (Nicotiana sp.). 

[0023] In a fifth aspect of the invention, methods are provided for the production of 
mutant c3h in host cells. Preferred host cells are bacteria and yeast. 



DESCRIPTION OF THE FIGURES 
[0024] Figures 1A-C show HPLC analysis of soluble secondary metabolites produced 
by wild-type and ref8 plants. (A) Compounds found in wild-type and ref8 leaves were 
extracted with methanol and analyzed by HPLC. The elution of UV-absorbing compounds 
was monitored at 320 nm. (B) Hydroxycinnamic acids released from their ester conjugates 
by treatment ([...], room temperature) of the methanolic extract of (A). (C) 
The same analyses as (A) performed on wild-type and ref8 seed extracts. Sinmal, 
sinapoylmalate; Singlc, sinapoylglucose; Sincho, sinapoylcholine. 
[0025] Figures 2A-B show analysis of phenylpropanoid synthesis in wild-type and ref8 
leaves using radiotracer feeding technology. 14C-L-Phenylalanine was administered to 
individual illuminated leaves of wild-type and mutant plants for 3 h before extraction in 60°C 
MeOH in the presence of unlabeled hydroxycinnamic acids. A) Two dimensional silica-gel 
TLC (solvent 1, petroleum ether/ethyl acetate/methanol/acetic acid 10:10:1:0.2; solvent 2, 
toluene/acetic acid/water 2:1:sat.) of the methanolic extract followed by autoradiography to 
identify the radiolabeled metabolites synthesized. B) Similar analyses performed after the 
methanolic extract from the previous experiment was saponified to release ester-bound 
phenolic acids. C, cinnamic acid; pC, p-coumaric acid; F, ferulic acid; S, sinapic acid. 
[0026] Figures 3A-C show analysis of phenylpropanoid synthesis in A) a standard, B) 
wild-type, and C) ref8 leaves using radiotracer feeding technology. Radiotracer feeding and 
metabolite extraction were performed as in Figure 2 except that extracts were dried and 
methanolyzed by incubation in methanolic HCl (80°C, 1 h). Products were extracted into 
diethyl ether and methylcaffeate was purified by semi-preparative silica gel TLC (solvent 1, 
[...]ene/dioxane/acetic acid 90:10:1; solvent 2, toluene/acetic acid/water 2:1:sat.), followed 
by reverse phase HPLC. The continuous line represents the UV absorption of the HPLC 
column eluate. The histogram represents radioactivity associated with each fraction. Caf, 
caffeic acid; Coum, p-coumaric acid; MeCaf, methylcaffeate; MeCoum, methyl 
p-coumarate. 

[0027] Figures 4A-B show HPLC separation of monomeric phenolic products liberated 
upon alkaline hydrolysis from cell walls of Arabidopsis wild-type, ref8 and fah1 plants. Cell 
walls were prepared from rachis tissue of 6 week old plants by successive extraction with 
neutral phosphate buffer, ethanol and acetone and subjected to alkaline hydrolysis (1 M 
NaOH, 16 h, room temperature). Released phenolics were extracted in ethyl acetate and 
separated by reverse phase HPLC using detection at 320 (A) or 275 (B) nm. pCoum c, 
cis-p-coumaric acid; pCoum t, trans-p-coumaric acid; Fer c, cis-ferulic acid; Fer t, trans-ferulic acid; 
Van, vanillin; Sad, syringaldehyde; pOHB, p-hydroxybenzaldehyde. 
[0028] Figures 5A-C show analysis of lignin monomer composition of wild-type and 
ref8. (A) Nitrobenzene oxidation products analyzed by HPLC. IS, internal standard 
(3-ethoxy-4-hydroxybenzaldehyde); pOHBen, p-hydroxybenzaldehyde; Syr, syringaldehyde; 
Van, vanillin; VanA, vanillic acid. (B) DFRC products analyzed by GC-MS using selective 
ion monitoring. CAD, coniferyl alcohol diacetate; pCAD, p-coumaryl alcohol diacetate; IS, 
internal standard (4,4'-ethylidenebisphenol); SAD, sinapyl alcohol diacetate. (C) Lignin 
analysis by Py-GC-MS. G1-G12, guaiacyl lignin pyrolysis products; H1-H5, p-hydroxyphenyl 
lignin pyrolysis products; IS, internal standard (1,3,5-tri-t-butylbenzene); S1-S3, syringyl 
lignin pyrolysis products. 

[0029] Figure 6 shows expression of active C3H in yeast. Yeast carrying the 
YeDP60 vector or the YeDP60-C3H vector were cultured and induced as described 
previously (Humphreys et al., 1999) in media supplemented with p-coumaric acid. At the 
end of the incubation period the medium was extracted with ethyl acetate, and analyzed by 
HPLC. Spectra of the caffeic acid standard, and the co-chromatographing novel peak found 
in the medium of the yeast carrying YeDP60-C3H are shown at right. 
[0030] Figure 7 shows map-based cloning of the REF8 gene. A narrow mapping 
interval was determined for the REF8 gene containing three genes considered to be 
candidates for REF8. The P450 (underlined) encoded on BAC T20B5 (T20B5.9) was 
selected as the most likely candidate for REF8 based upon its position within the mapping 
interval relative to the frequency of recombinant chromosomes identified within the mapping 
population at the flanking markers. Near the heme binding motif of T20B5.9, the ref8 gene 
contains a single G to A transition mutation that results in a G444A substitution in the amino 
acid sequence. 

[0031] Figures 8A-D show heterologous expression of REF8. (A) SDS-PAGE analysis 
of microsomal proteins from E. coli harboring pBOV (lane 1), pBOV-REF8 (lane 2), and 
pBOV-ref8 (lane 3). (B) Carbon monoxide difference spectra of dithionite-reduced E. coli 
membranes. Dashed line, baseline prior to CO treatment; solid line, CO difference spectrum 
of REF8-containing membranes; dotted line, CO difference spectrum of membranes from E. 
coli expressing the protein encoded by the ref8 allele. (C) SDS-PAGE analysis of membrane 
protein-enriched Triton X-114 detergent phase of yeast microsomal proteins harboring 
pYeDP60 (lane 1), pYeDP60-REF8 (lane 2), and pYeDP60-ref8 (lane 3). (D) Carbon 
monoxide difference spectrum of dithionite-reduced Triton X-114 detergent phase from 
microsomes prepared from yeast harboring pYeDP60-REF8. Dashed line, baseline prior to 
CO treatment; solid line, difference spectrum after CO treatment. 

[0032] Figure 9 shows kinetic analysis of C3H-catalyzed substrate 3-hydroxylation. 
(A) Comparison of relative activity of C3H using p-coumarate, p-coum[...], 
p-coumaryl alcohol, and p-coumaroyl methyl ester at a concentration of 1 mM. (B) 
Lineweaver-Burk plot of p-coumaroyl methyl ester 3-hydroxylation. For both graphs, error 
bars represent one standard deviation for triplicate assays. 

SUMMARY OF THE SEQUENCES 
[0033] SEQ ID NO: 1 is the nucleotide sequence for the coding region of the wild-type 
C3H gene from Arabidopsis. 

[0034] SEQ ID NO:2 is the nucleotide sequence of the ref8 gene. 

[0035] SEQ ID NO: 3 is the nucleotide sequence of the gene used for expression in 
yeast. 

[0036] SEQ ID NO:4 is the amino acid sequence encoded by SEQ ID NO: 1. 

[0037] SEQ ID NO:5 is the amino acid sequence encoded by SEQ ID NO:2. 
[0038] SEQ ID NO:6 is the amino acid sequence of the catalytic domain of SEQ ID 
NO:4. 

[0039] SEQ ID NO: 7 is the amino acid sequence of the catalytic domain of SEQ ID 
NO:5. 



DETAILED DESCRIPTION OF THE INVENTION 
[0040] The present invention is directed to methods to alter secondary metabolism of a 
plant, more specifically phenylpropanoid metabolism. The present invention is also directed 
to novel mutant polynucleotide molecules, referred to as ref8, that encode an Arabidopsis 
p-coumarate 3-hydroxylase having altered biological activity. The present invention is also 
directed to uses of the novel nucleotide sequences set forth herein, including their use in 
vectors and other DNA constructs for transforming plants and microorganisms. The DNA 
constructs and transgenic plants are further aspects of the present invention. 
[0041] Advances in biotechnology have provided the tools with which to manipulate 
phenylpropanoid metabolism, and a number of cases have illustrated the potential value of 
this approach. Most of the genes encoding the enzymes of the phenylpropanoid pathway 
have been cloned over the last three years and, since their original isolation, an array of 
orthologues have been cloned from various species. Despite the fact that much is known of 
all other enzymes in the phenylpropanoid pathway, C3H has not previously been 
characterized. Knowledge of the activity of C3H disclosed herein enables alteration of the 
phenylpropanoid pathway. Some uses of the activity of C3H polypeptides are described in 
more detail below. 

Definitions 

[0042] The present invention employs the following definitions: 
[0043] "C3H" or "p-coumarate 3-hydroxylase" refers to an enzyme in the 
phenylpropanoid biosynthetic pathway which complements the ref8 mutant when expressed 
in Arabidopsis. 

[0044] "c3h" or "mutant C3H" each refer to the polypeptide encoded by the ref8 
gene. 

[0045] "ref8 gene" refers to the polynucleotide which is complemented by the wild-type 
C3H when expressed in Arabidopsis. The ref8 gene has the nucleotide sequence set 
forth in SEQ ID NO:2. 

[0046] "Altered lignin content" or "modified lignin content" refers to the 
modification of total lignin content as measured by the method described herein or other 
methods known in the art. 

[0047] "Altered lignin composition" or "modified lignin composition" refers to modification 
of the lignin monomer composition compared to the parent plant from which the plant having 
the modified phenotype is obtained. Alterations correlated with suppression of C3H 
polypeptide activity may include increases in percent H lignin monomers. Such modified 
lignin composition can be uniform throughout the plant and typically arises when each of the cells 
within the plant is transformed with a vector comprising at least a portion of the 
REF8 nucleic acid. Such plants are sometimes referred to as transgenic plants. The 
phenotype produced in a particular plant is dependent upon the design of the vector used to 
produce it. Thus, the vector can be designed to transcribe a nucleic acid which encodes at 
least a portion of the C3H protein. In such cases, the C3H protein so produced is capable of 
conferring a particular phenotype based on the presence of that protein within the cell. 
Alternatively, the vector can be constructed such that transcription results in the formation of 
a transcript which is capable of hybridizing with an RNA transcript of an endogenous C3H 
homolog gene. This approach employs the well known antisense technology and results in a 
modulation in the phenotypic effect of the endogenous REF8 genes. Such modulation of the 
endogenous REF8 gene can also potentially be obtained by using the sense strand of the 
REF8 gene to cause sense suppression of the endogenous REF8 alleles as well as the REF8 
gene introduced in the vector. The production of a plant containing such a phenotype is 
contemplated based upon the sense suppression observed in Petunia hybrida as set forth in 
PCT Publication WO 90/12084. 

[0048] "Encode." A polynucleotide is said to "encode" a polypeptide if, in its native 
state or when manipulated by methods well known to those skilled in the art, it can be 
transcribed and/or translated to produce the RNA for and/or the polypeptide or a fragment 
thereof. The anti-sense strand is the complement of such a nucleic acid and the encoding 
sequence can be deduced therefrom. 
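
The relationship between an encoding strand and its anti-sense complement, as described in the preceding definition, can be sketched in Python. This is a minimal example under stated assumptions: the function name reverse_complement is a hypothetical label, the sketch handles only the unambiguous bases A, C, G and T, and the sequence shown is a made-up placeholder that is not taken from any SEQ ID NO of the sequence listing.

```python
# Watson-Crick pairing for DNA; ambiguity codes are not handled in this sketch.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the anti-sense strand of seq, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense = "ATGGCGTTCA"  # hypothetical placeholder sequence
    antisense = reverse_complement(sense)
    print(antisense)
    # Applying the operation twice recovers the original strand, which is
    # why the encoding sequence can be deduced from the anti-sense strand.
    print(reverse_complement(antisense) == sense)
```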

[0049] "Operably linked" refers to a juxtaposition wherein the components so 
described are in a relationship permitting them to function in their intended manner, i.e., a 
promoter is operably linked to a coding sequence if the promoter affects its transcription or 
expression. 

[0050] As used herein, a "portion" of the REF8 locus or region or allele is defined as 
having a minimal size of at least about eight nucleotides, or preferably about 15 nucleotides, 
or more preferably at least about 25 nucleotides, and may have a minimal size of at least 
about 40 nucleotides. This definition includes all sizes in the range of 8-40 nucleotides as 
well as greater than 40 nucleotides. Thus, this definition includes nucleic acids of 8, 12, 15, 
20, 25, 40, 60, 80, 100, 200, 300, 400, 500 nucleotides, or nucleic acids having any number 
of nucleotides within these ranges of values (e.g., 9, 10, 11, 16, 23, 30, 38, 50, 72, 121, etc., 
nucleotides), or nucleic acids having at least 8 nucleotides derived from SEQ ID NO: 1 or 
SEQ ID NO: 2, their complement or functionally equivalent nucleic acid sequences. 
[0051] "Recombinant nucleic acid" is a nucleic acid which is not naturally occurring,
or which is made by the artificial combination of two otherwise separated segments of
sequence. This artificial combination is often accomplished by either chemical synthesis 
means, or by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic 
engineering techniques. Such is usually done to replace a codon with a redundant codon 
encoding the same or a conservative amino acid, while typically introducing or removing a 
sequence recognition site. Alternatively, it is performed to join together nucleic acid 
segments of desired functions to generate a desired combination of functions. 
[0052] "REF8 allele" refers, respectively, to normal alleles of the REF8 locus as well
as alleles of REF8 having variations, isolated from plants or produced in accordance with the
present invention. 

[0053] "REF8 locus", "REF8 gene", "REF8 nucleic acids" or "REF8
polynucleotide" each refer to polynucleotides, all of which are in the REF8 region,
respectively, that are likely to be expressed in normal tissue and involved in phenylpropanoid 
metabolism. The REF8 locus is intended to include coding sequences, intervening sequences 
and regulatory elements (e.g., promoters and enhancers) controlling transcription and/or 
translation. The REF8 locus is intended to include all allelic variations of the DNA sequence. 
[0054] These terms, when applied to a nucleic acid, refer to a nucleic acid which 
encodes a C3H polypeptide, fragment, homolog or variant, including, e.g., protein fusions or 
deletions. The nucleic acids of the present invention will possess a sequence which is either 
derived from, or substantially similar to, a natural REF8-encoding gene or one having 
substantial homology with a natural REF8-encoding gene or a portion thereof. The term 
REF8 nucleic acid is sometimes used to refer to the sense and antisense strands of the REF8 
gene collectively. 

[0055] The REF8 gene or nucleic acid includes normal alleles of the REF8 gene, 
respectively, including silent alleles having no effect on the amino acid sequence of the C3H 
polypeptide that do not substantially affect its function. These terms also include alleles 
having one or more mutations which adversely affect the function of the C3H polypeptide. A 
mutation may be a change in the REF8 nucleic acid sequence which produces a deleterious 
change in the amino acid sequence of the C3H polypeptide, resulting in partial or complete 
loss of C3H protein function, such as ref8, or may be a change in the nucleic acid sequence 
which results in the loss of effective C3H protein expression or the production of aberrant 
forms of the C3H polypeptide. 

[0056] The REF8 nucleic acid may be that shown in SEQ ID NO: 1 or SEQ ID NO: 2 
or it may be an allele as described above or a variant or derivative differing from that shown 
by a change which is one or more of addition, insertion, deletion and substitution of one or 
more nucleotides of the sequence shown. Changes to the nucleotide sequence may result in 
an amino acid change at the protein level, or not, as determined by the genetic code. 
[0057] Thus, nucleic acid according to the present invention may include a sequence 
different from the sequence shown in SEQ ID No: 1 or SEQ ID NO: 2 yet encode a 
polypeptide with the same amino acid sequence as shown in SEQ ID NO: 4 or SEQ ID NO:5. 
That is, nucleic acids of the present invention include sequences which are degenerate as a
result of the genetic code. On the other hand, the encoded polypeptide may comprise an
amino acid sequence which differs by one or more amino acid residues from the amino acid 
sequence shown in SEQ ID NO: 4, such as that of SEQ ID NO: 5. Nucleic acid encoding a 
polypeptide which is an amino acid sequence variant, derivative or allele of the amino acid 
sequence shown in SEQ ID NO: 4 or SEQ ID NO: 5 is also provided by the present invention. 
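The degeneracy described above can be illustrated with a minimal Python sketch. The two short sequences below are hypothetical and are not taken from SEQ ID NO: 1 or SEQ ID NO: 2; the codon table is the standard genetic code, and the function name `translate` is chosen here only for illustration.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical sequences that differ at three nucleotide positions.
seq_a = "ATGGCTCTGAAA"
seq_b = "ATGGCATTAAAG"

print(translate(seq_a), translate(seq_b))
```

Both hypothetical sequences translate to the same four-residue peptide even though their nucleotide sequences differ, which is the sense in which a nucleic acid may differ from a reference sequence yet encode the same polypeptide.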
[0058] The REF8 gene, respectively, also refers to (a) any DNA sequence that (i) 
hybridizes to the complement of the DNA sequences that encode the amino acid sequence set 
forth in SEQ ID NO: 4 or SEQ ID NO:5 under less stringent conditions, such as moderately 
stringent conditions (Ausubel et al. (1992)) and (ii) encodes a gene product functionally 
equivalent to REF8. The invention also includes nucleic acid molecules that are the 
complements of the sequences described herein. 

[0059] The polynucleotide compositions of this invention include RNA, cDNA, 
genomic DNA, synthetic forms, and mixed polymers, both sense and antisense strands, and 
may be chemically or biochemically modified or may contain non-natural or derivatized 
nucleotide bases, as will be readily appreciated by those skilled in the art. Such 
modifications include, for example, labels, methylation, substitution of one or more of the 
naturally occurring nucleotides with an analog, internucleotide modifications such as 
uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, 
carbamates, etc.), charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), 
pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen, etc.), chelators, 
alkylators, and modified linkages (e.g., alpha anomeric nucleic acids, etc.). Also included are 
synthetic molecules that mimic polynucleotides in their ability to bind to a designated 
sequence via hydrogen bonding and other chemical interactions. Such molecules are known 
in the art and include, for example, those in which peptide linkages substitute for phosphate
linkages in the backbone of the molecule. 

[0060] The present invention provides recombinant nucleic acids comprising all or part 
of the REF8 region. The recombinant construct may be capable of replicating autonomously 
in a host cell. Alternatively, the recombinant construct may become integrated into the 
chromosomal DNA of the host cell. Such a recombinant polynucleotide comprises a 
polynucleotide of genomic, cDNA, semi-synthetic, or synthetic origin which, by virtue of its 
origin or manipulation, 1) is not associated with all or a portion of a polynucleotide with 
which it is associated in nature; 2) is linked to a polynucleotide other than that to which it is 
linked in nature; or 3) does not occur in nature. Where nucleic acid according to the 
invention includes RNA, reference to the sequence shown should be construed as reference
to the RNA equivalent, with U substituted for T. 

[0061] Therefore, recombinant nucleic acids comprising sequences otherwise not 
naturally occurring are provided by this invention. Although the wild-type sequence may be 
employed, it may also be altered, e.g., by deletion, substitution or insertion. cDNA or 
genomic libraries of various types may be screened as natural sources of the nucleic acids of 
the present invention, or such nucleic acids may be provided by amplification of sequences 
resident in genomic DNA or other natural sources, e.g., by PCR. The choice of cDNA 
libraries normally corresponds to a tissue source which is abundant in mRNA for the desired 
proteins. Phage libraries are normally preferred, but other types of libraries may be used. 
Clones of a library are spread onto plates, transferred to a substrate for screening, denatured 
and probed for the presence of desired sequences. 

[0062] The DNA sequences used in this invention will usually comprise at least about 
five codons (15 nucleotides), more usually at least about 7-15 codons, and most preferably, at 
least about 35 codons. One or more introns may also be present. This number of nucleotides 
is usually about the minimal length required for a successful probe that would hybridize 
specifically with an REF8 encoding sequence. In this context, oligomers of as low as 8 
nucleotides, more generally 8-17 nucleotides, can be used for probes, especially in 
connection with chip technology. 

[0063] Techniques for nucleic acid manipulation are described generally, e.g., in 
Sambrook et al. (1989) or Ausubel et al. (1992). Reagents useful in applying such 
techniques, such as restriction enzymes and the like, are widely known in the art and 
commercially available from such vendors as New England BioLabs, Boehringer Mannheim, 
Amersham, Promega, U. S. Biochemicals, New England Nuclear, and a number of other 
sources. The recombinant nucleic acid sequences used to produce fusion proteins of the 
present invention may be derived from natural or synthetic sequences. Many natural gene 
sequences are obtainable from various cDNA or from genomic libraries using appropriate 
probes. See, GenBank, National Institutes of Health. 

[0064] "C3H protein" and "C3H polypeptide" refer to a protein or polypeptide
encoded by the REF8 locus, variants or fragments thereof. The term "polypeptide" refers to a
polymer of amino acids and its equivalent and does not refer to a specific length of the 
product; thus, peptides, oligopeptides and proteins are included within the definition of a 
polypeptide. This term also does not refer to, or exclude modifications of the polypeptide, for 
example, glycosylations, acetylations, phosphorylations, and the like. Included within the 
definition are, for example, polypeptides containing one or more analogs of an amino acid
(including, for example unnatural amino acids, etc.), polypeptides with substituted linkages 
as well as other modifications known in the art, both naturally and non-naturally occurring. 
Ordinarily, such polypeptides will be at least about 50% homologous to the native REF8 
nucleic acid sequence, preferably in excess of about 90%, and more preferably at least about 
95% homologous. Also included are proteins encoded by DNA which hybridize under high 
or low stringency conditions, to REF8-encoding nucleic acids and closely related 
polypeptides or proteins retrieved by antisera to the C3H protein(s). 
[0065] The C3H polypeptide may be that shown in SEQ ID NO: 4 or SEQ ID NO:5 
which may be in isolated and/or purified form, free or substantially free of material with 
which it is naturally associated. The polypeptide may, if produced by expression in a 
prokaryotic cell or produced synthetically, lack native post-translational processing, such as 
glycosylation. Alternatively, the present invention is also directed to polypeptides which are 
sequence variants, alleles or derivatives of the C3H polypeptide. Such polypeptides may 
have an amino acid sequence which differs from that set forth in SEQ ID NO: 4 or SEQ ID 
NO: 5 by one or more of addition, substitution, deletion or insertion of one or more amino 
acids. In one embodiment, these variant polypeptides have a function similar to C3H protein. 
In a second embodiment, these variant peptides do not retain the C3H protein function such 
that they can be used as a dominant negative. 

[0066] Substitutional variants typically contain the exchange of one amino acid for 
another at one or more sites within the protein, and may be designed to modulate one or more 
properties of the polypeptide, such as stability against proteolytic cleavage, without the loss 
of other functions or properties. Amino acid substitutions may be made on the basis of 
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the 
amphipathic nature of the residues involved. Preferred substitutions are ones which are 
conservative, that is, one amino acid is replaced with one of similar shape and charge. 
Conservative substitutions are well known in the art and typically include substitutions within 
the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic 
acid; asparagine, glutamine; serine, threonine; lysine, arginine; and tyrosine, phenylalanine. 
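As a minimal sketch, the conservative substitution groups listed above can be encoded and used to classify the differences between two polypeptides of equal length. The function names and the example peptides are hypothetical; only the seven groups come from the text.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    {"G", "A"},        # glycine, alanine
    {"V", "I", "L"},   # valine, isoleucine, leucine
    {"D", "E"},        # aspartic acid, glutamic acid
    {"N", "Q"},        # asparagine, glutamine
    {"S", "T"},        # serine, threonine
    {"K", "R"},        # lysine, arginine
    {"Y", "F"},        # tyrosine, phenylalanine
]


def is_conservative(a, b):
    """True if a and b are different residues drawn from the same group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """List (1-based position, ref residue, variant residue, conservative?)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (pos, r, v, is_conservative(r, v))
        for pos, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]


# Hypothetical peptides, not taken from SEQ ID NO: 4 or SEQ ID NO: 5.
print(classify_substitutions("MKVLD", "MRVLA"))
```

In this hypothetical pair, the lysine-to-arginine change at position 2 falls within a listed group, whereas the aspartic acid-to-alanine change at position 5 does not.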
[0067] Certain amino acids may be substituted for other amino acids in a protein 
structure without appreciable loss of interactive binding capacity with structures such as, for 
example, antigen-binding regions of antibodies or binding sites on substrate molecules or 
binding sites on proteins interacting with the C3H polypeptide. Since it is the interactive 
capacity and nature of a protein which defines that protein's biological functional activity, 
certain amino acid substitutions can be made in a protein sequence, and its underlying DNA
coding sequence, and nevertheless obtain a protein with like properties. In making such 
changes, the hydropathic index of amino acids may be considered. The importance of the 
hydrophobic amino acid index in conferring interactive biological function on a protein is 
generally understood in the art (U.S. Patent 4,554,101). The use of the hydrophobic index or 
hydrophilicity in designing polypeptides is further discussed in U.S. Patent 5,691,198. 
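By way of a hedged illustration, the following Python sketch compares the hydropathic index of two residues. The text does not state which scale is intended; the widely used Kyte-Doolittle values are assumed here, and the function name `hydropathy_shift` is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not specified in the text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_shift(original, replacement):
    """Absolute change in hydropathic index caused by a substitution."""
    delta = KYTE_DOOLITTLE[replacement.upper()] - KYTE_DOOLITTLE[original.upper()]
    return round(abs(delta), 1)


# A small shift suggests similar hydropathic character; a large one does not.
for pair in (("I", "V"), ("K", "R"), ("I", "D")):
    print(pair, hydropathy_shift(*pair))
```

Under this assumed scale, isoleucine-to-valine and lysine-to-arginine changes shift the index only slightly, while isoleucine-to-aspartic acid shifts it substantially.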
[0068] The length of polypeptide sequences compared for homology will generally be 
at least about 16 amino acids, usually at least about 20 residues, more usually at least about 
24 residues, typically at least about 28 residues, and preferably more than about 35 residues. 
[0069] "VIGS" or "virus induced gene silencing" refers to suppression of gene
expression in plants in a sequence-specific manner by infection with virus vectors carrying
fragments of host genes. The mechanism of this gene silencing is based on an RNA-mediated
defense against viruses (Baulcombe, 1999). It has also emerged that a related
mechanism is involved in the post-transcriptional silencing that accounts for between-line
variation in transgene expression and co-suppression of transgenes and endogenous genes.
[0070] "Regulatory sequences" refers to those sequences which affect the expression
of the gene (including transcription of the gene, and translation, splicing, stability or the like
of the messenger RNA and tissue specificity). These sequences are normally within 100 kb 
of the coding region of a locus, although they may also be more distant from the coding 
region, or they may be located within coding regions of the gene. 

[0071] As used herein, an "isolated nucleic acid molecule" is a polymer of RNA or
DNA that is single- or double-stranded, optionally containing synthetic, non-natural or
altered nucleotide bases. An isolated nucleic acid molecule in the form of a polymer of DNA 
may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA. 
[0072] A nucleic acid or fragment thereof has substantial identity with another if,
when optimally aligned (with appropriate nucleotide insertions or deletions) with the other
nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least 
about 60% of the nucleotide bases, usually at least about 70%, more usually at least about 
80%, preferably at least about 90%, and more preferably at least about 95-98% of the 
nucleotide bases. A protein or fragment thereof has substantial identity with another if, 
optimally aligned, there is an amino acid sequence identity of at least about 30% identity with 
an entire naturally-occurring protein or a portion thereof, usually at least about 70% identity, 
more usually at least about 80% identity, preferably at least about 90% identity, and more 
preferably at least about 95% identity. 
[0073] Identity means the degree of sequence relatedness between two polypeptide or
two polynucleotide sequences as determined by the identity of the match between two
strings of such sequences, such as the full and complete sequence. Identity can be readily 
calculated. While there exist a number of methods to measure identity between two 
polynucleotide or polypeptide sequences, the term "identity" is well known to skilled artisans 
(Lesk, A. M., ed., 1988; Smith, D. W., ed., 1993; Griffin and Griffin, eds., 1994; von Heijne,
1987; and Gribskov and Devereux, eds., 1991). Methods commonly employed to determine 
identity between two sequences include, but are not limited to those disclosed in Guide to 
Huge Computers, Martin J. Bishop, ed., Academic Press, San Diego, 1994, and Carillo and 
Lipman, 1988. Preferred methods to determine identity are designed to give the largest 
match between the two sequences tested. Such methods are codified in computer programs. 
Preferred computer program methods to determine identity between two sequences include, 
but are not limited to, GCG (Genetics Computer Group, Madison Wis.) program package 
(Devereux, et al., 1984), BLASTP, BLASTN, FASTA (Altschul, et al., 1990; Altschul, et al., 
1997). The well-known Smith-Waterman algorithm may also be used to determine identity.
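For illustration only, a simplified Smith-Waterman local alignment can be sketched in Python as follows. The scoring values (match +2, mismatch -1, linear gap penalty -2) and the example sequences are assumptions made here; they are not prescribed by the text, and production programs such as those named above use more elaborate scoring.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment.

    Simplified sketch: linear gap penalty, illustrative scoring values.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; scores never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical sequences sharing the local segment ACGTTG.
score, aligned_a, aligned_b = smith_waterman("ACGTTGCA", "TTACGTTG")
print(score, aligned_a, aligned_b)
```

The aligned segments returned by such a routine can then be compared position by position to obtain a percent identity.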
[0074] As an illustration, by a polynucleotide having a nucleotide sequence having at
least, for example, 95% "identity" to a reference nucleotide sequence, it is intended that the
nucleotide sequence of the polynucleotide is identical to the reference sequence except that 
the polynucleotide sequence may include up to five point mutations per each 100 nucleotides 
of the reference nucleotide sequence. In other words, to obtain a polynucleotide having a 
nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 5% of the 
nucleotides in the reference sequence may be deleted or substituted with another nucleotide, 
or a number of nucleotides up to 5% of the total nucleotides in the reference sequence may be 
inserted into the reference sequence. These mutations of the reference sequence may occur at 
the 5' or 3' terminal positions of the reference nucleotide sequence or anywhere between those
terminal positions, interspersed either individually among nucleotides in the reference 
sequence or in one or more contiguous groups within the reference sequence. 
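The arithmetic of the 95% example can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned (gaps written as "-") and that identity is computed over the alignment length, which is one of several conventions in use; the function names and test sequences are hypothetical.

```python
def percent_identity(aligned_ref, aligned_query):
    """Percent of aligned positions holding the same residue in both sequences.

    Assumes pre-aligned, equal-length strings; gap positions ('-') never count
    as matches. The denominator is the alignment length (an assumed convention).
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r == q and r != "-"
    )
    return 100.0 * matches / len(aligned_ref)


def max_allowed_differences(reference_length, min_identity=95):
    """Largest number of point differences compatible with min_identity percent."""
    return reference_length * (100 - min_identity) // 100


# Hypothetical 100-nucleotide reference with five substitutions introduced.
reference = "ACGT" * 25
query = list(reference)
for position in (0, 20, 40, 60, 80):
    query[position] = "T"  # the reference has 'A' at each of these positions
query = "".join(query)

print(percent_identity(reference, query), max_allowed_differences(100))
```

With five substitutions in a 100-nucleotide reference, the computed identity sits exactly at the 95% threshold, matching the "up to five point mutations per each 100 nucleotides" reading given above.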
[0075] Alternatively, substantial homology (or similarity) exists when a nucleic acid
or fragment thereof will hybridize to another nucleic acid (or a complementary strand
thereof) under selective hybridization conditions, to a strand, or to its complement.
Selectivity of hybridization exists when hybridization which is substantially more selective 
than total lack of specificity occurs. Typically, selective hybridization will occur when there 
is at least about 55% homology over a stretch of at least about 14 nucleotides, preferably at 
least about 65%, more preferably at least about 75%, and most preferably at least about 90%. 
The length of homology comparison, as described, may be over longer stretches, and in
certain embodiments will often be over a stretch of at least about nine nucleotides, usually at 
least about 20 nucleotides, more usually at least about 24 nucleotides, typically at least about 
28 nucleotides, more typically at least about 32 nucleotides, and preferably at least about 36 
or more nucleotides. 
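The "at least about 55% homology over a stretch of at least about 14 nucleotides" criterion can be approximated computationally with a sliding window, as in the hedged sketch below. It compares two sequences position by position without gaps, so a probe would first have to be expressed in the same orientation as the strand it is compared with; this ungapped comparison, the function names and the example sequences are simplifying assumptions, and actual hybridization also depends on the physical conditions discussed in the following paragraphs.

```python
def window_identity(a, b, start, window):
    """Percent of matching positions within one window of an ungapped comparison."""
    pairs = zip(a[start:start + window], b[start:start + window])
    return 100.0 * sum(1 for x, y in pairs if x == y) / window


def best_window_identity(a, b, window=14):
    """Highest percent identity found over any window of the given length."""
    a, b = a.upper(), b.upper()
    shared = min(len(a), len(b))
    if shared < window:
        raise ValueError("sequences are shorter than the comparison window")
    return max(window_identity(a, b, s, window) for s in range(shared - window + 1))


def meets_selectivity_threshold(a, b, window=14, min_identity=55.0):
    """True if some window reaches the minimum identity named in the text."""
    return best_window_identity(a, b, window) >= min_identity


# Hypothetical sequences: identical except for the first three positions.
target = "ACGTACGTACGTACGTACGT"
probe = "TTTTACGTACGTACGTACGT"
print(best_window_identity(target, probe), meets_selectivity_threshold(target, probe))
```

The default threshold corresponds to the lowest figure recited above; the 65%, 75% and 90% levels can be tested by passing a different `min_identity`.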

[0076] Nucleic acid hybridization will be affected by such conditions as salt
concentration, temperature, or organic solvents, in addition to the base composition, length of
the complementary strands, and the number of nucleotide base mismatches between the
hybridizing nucleic acids, as will be readily appreciated by those skilled in the art. Stringent
temperature conditions will generally include temperatures in excess of 30°C, typically in
excess of 37°C, and preferably in excess of 45°C. Stringent salt conditions will ordinarily
be less than 1000 mM, typically less than 500 mM, and preferably less than 200 mM.
However, the combination of parameters is much more important than the measure of any 
single parameter. The stringency conditions are dependent on the length of the nucleic acid 
and the base composition of the nucleic acid, and can be determined by techniques well 
known in the art. See, e.g., Ausubel, 1987; Wetmur and Davidson, 1968. 
[0077] Thus, as herein used, the term "stringent conditions" means hybridization
will occur only if there is at least 95% and preferably at least 97% identity between the
sequences. Such hybridization techniques are well known to those of skill in the art.
Stringent hybridization conditions are as defined above or, alternatively, conditions under
overnight incubation at 42°C in a solution comprising: 50% formamide, 5x SSC (150 mM
NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution,
10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by
washing the filters in 0.1x SSC at about 65°C.

[0078] The terms "isolated", "substantially pure", and "substantially
homogeneous" are used interchangeably to describe a protein or polypeptide which has been
separated from components which accompany it in its natural state. A monomeric protein is
substantially pure when at least about 60 to 75% of a sample exhibits a single polypeptide 
sequence. A substantially pure protein will typically comprise about 60 to 90% W/W of a 
protein sample, more usually about 95%, and preferably will be over about 99% pure. 
Protein purity or homogeneity may be indicated by a number of means well known in the art, 
such as polyacrylamide gel electrophoresis of a protein sample, followed by visualizing a 
single polypeptide band upon staining the gel. For certain purposes, higher resolution may be 
provided by using HPLC or other means well known in the art which are utilized for
purification. 

[0079] Large amounts of the nucleic acids of the present invention may be produced
by (a) replication in a suitable host or transgenic animals or (b) chemical synthesis using
techniques well known in the art. Constructs prepared for introduction into a prokaryotic or 
eukaryotic host may comprise a replication system recognized by the host, including the 
intended polynucleotide fragment encoding the desired polypeptide, and will preferably also 
include transcription and translational initiation regulatory sequences operably linked to the 
polypeptide encoding segment. Expression vectors may include, for example, an origin of 
replication or autonomously replicating sequence (ARS) and expression control sequences, a 
promoter, an enhancer and necessary processing information sites, such as ribosome-binding 
sites, RNA splice sites, polyadenylation sites, transcriptional terminator sequences, and 
mRNA stabilizing sequences. Secretion signals may also be included where appropriate 
which allow the protein to cross and/or lodge in cell membranes, and thus attain its functional 
topology, or be secreted from the cell. Such vectors may be prepared by means of standard 
recombinant techniques well known in the art. 

[0080] "Codon degeneracy" refers to the nature in the genetic code permitting
variation of the nucleotide sequence without affecting the amino acid sequence of an encoded
polypeptide. Accordingly, the instant invention relates to any nucleic acid molecule that 
encodes all or a substantial portion of the amino acid sequence encoding the instant C3H 
polypeptides as set forth in SEQ ID NOs:2 and 4. The skilled artisan is well aware of the 
"codon-bias" exhibited by a specific host cell in usage of nucleotide codons to specify a 
given amino acid. Therefore, when synthesizing a gene for improved expression in a host 
cell, it is desirable to design the gene such that its frequency of codon usage approaches the 
frequency of preferred codon usage of the host cell. 
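A minimal sketch of this design step is given below, assuming that codon preference is estimated simply by counting codons in a sample of host coding sequences and that each codon of the gene is replaced by the most frequent synonymous host codon. The host sample, the input gene and all function names are hypothetical; practical codon optimization typically weighs additional factors.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into complete codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def translate(seq):
    return "".join(CODON_TABLE[c] for c in codons(seq))


def preferred_codons(host_coding_seqs):
    """Map each amino acid to its most frequent codon in the host sample.

    Ties, and amino acids absent from the sample, fall back to table order.
    """
    counts = Counter(c for seq in host_coding_seqs for c in codons(seq))
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    return best


def recode(gene, preferred):
    """Replace every codon with the host-preferred synonymous codon."""
    return "".join(preferred[CODON_TABLE[c]] for c in codons(gene))


# Hypothetical host sample and gene.
host_sample = ["ATGGCTGCTGCCAAGAAGAAATAA"]
gene = "ATGGCAAAATAA"
optimized = recode(gene, preferred_codons(host_sample))
print(optimized, translate(gene) == translate(optimized))
```

The recoded gene differs at the nucleotide level but encodes the same amino acid sequence, consistent with the codon degeneracy defined above.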

[0081] "Synthetic genes" can be assembled from oligonucleotide building blocks
that are chemically synthesized using procedures known to those skilled in the art. These
building blocks are ligated and annealed to form gene segments which are then enzymatically 
assembled to construct the entire gene. "Chemically synthesized", as related to a sequence 
of DNA, means that the component nucleotides were assembled in vitro. Manual chemical 
synthesis of DNA may be accomplished using well established procedures, or automated 
chemical synthesis can be performed using one of a number of commercially available 
machines. Accordingly, the genes can be tailored for optimal gene expression based on 
optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled 
artisan appreciates the likelihood of successful gene expression if codon usage is biased
towards those codons favored by the host. Determination of preferred codons can be based 
on a survey of genes derived from the host cell where sequence information is available. 
[0082] "Gene" refers to a nucleic acid molecule that expresses a specific protein,
including regulatory sequences preceding (5' non-coding sequences) and following (3'
non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature
with its own regulatory sequences. "Chimeric gene" refers to any gene that is not a native 
gene, comprising regulatory and coding sequences that are not found together in nature. 
Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that 
are derived from different sources, or regulatory sequences and coding sequences derived 
from the same source, but arranged in a manner different than that found in nature. 
"Endogenous gene" refers to a native gene in its natural location in the genome of an 
organism. A "foreign" gene refers to a gene or gene copy that was not originally found in the 
host organism, but that is introduced into the host organism by gene transfer. Foreign genes 
can comprise native genes inserted into a non-native organism, additional copies of a native 
gene inserted into a native organism or chimeric genes. A "transgene" is a gene that has been 
introduced into the genome by a transformation procedure. 

[0083] "Coding sequence" refers to a DNA sequence that codes for a specific amino
acid sequence. "Suitable regulatory sequences" refer to nucleotide sequences located
upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a
coding sequence, and which influence the transcription, RNA processing or stability, or 
translation of the associated coding sequence. Regulatory sequences may include promoters, 
translation leader sequences, introns, polyadenylation recognition sequences, RNA 
processing site, effector binding site and stem-loop structure. 

[0084] "Promoter" refers to a DNA sequence capable of controlling the expression
of a coding sequence or functional RNA. In general, a coding sequence is located 3' to a
promoter sequence. Promoters may be derived in their entirety from a native gene, or be 
composed of different elements derived from different promoters found in nature, or even 
comprise synthetic DNA segments. It is understood by those skilled in the art that different 
promoters may direct the expression of a gene in different tissues or cell types, or at different 
stages of development, or in response to different environmental or physiological conditions. 
Promoters which cause a gene to be expressed in most cell types at most times are commonly 
referred to as "constitutive promoters". It is further recognized that since in most cases the 
exact boundaries of regulatory sequences have not been completely defined, DNA fragments
of different lengths may have identical promoter activity. 

[0085] The "3' non-coding sequences" refer to DNA sequences located downstream
of a coding sequence and include polyadenylation recognition sequences and other sequences
encoding regulatory signals capable of affecting mRNA processing or gene expression. The 
polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid 
tracts to the 3' end of the mRNA precursor. 

[0086] "RNA transcript" refers to the product resulting from RNA polymerase-catalyzed
transcription of a DNA sequence. When the RNA transcript is a perfect
complementary copy of the DNA sequence, it is referred to as the primary transcript or it may 
be a RNA sequence derived from posttranscriptional processing of the primary transcript and 
is referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the RNA that is 
without introns and that can be translated into protein by the cell. "cDNA" refers to a double- 
stranded DNA that is complementary to and derived from mRNA. "Sense" RNA refers to 
RNA transcript that includes the mRNA and so can be translated into protein by the cell. 
"Antisense RNA" refers to a RNA transcript that is complementary to all or part of a target 
primary transcript or mRNA and that blocks the expression of a target gene (U.S. Patent 
No. 5,107,065; WO 9928508). The complementarity of an antisense RNA may be with any
part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding
sequence, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme 
RNA, or other RNA that is not translated yet has an effect on cellular processes. 
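The relationship between sense and antisense RNA can be sketched in Python as follows. The short coding-strand sequence is hypothetical, and the function names are chosen for illustration; the sketch only shows the base-pairing relationship and says nothing about how effectively a given antisense RNA blocks expression.

```python
# Complement table for DNA bases.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def to_rna(dna):
    """RNA equivalent of a DNA sequence, with U substituted for T."""
    return dna.upper().replace("T", "U")


def sense_rna(coding_strand):
    """Sense RNA has the same sequence as the coding strand (U for T)."""
    return to_rna(coding_strand)


def antisense_rna(coding_strand):
    """Antisense RNA is complementary to the sense transcript."""
    return to_rna(reverse_complement(coding_strand))


# Hypothetical coding-strand fragment, written 5' to 3'.
coding = "ATGGCTAAG"
print(sense_rna(coding))      # AUGGCUAAG
print(antisense_rna(coding))  # CUUAGCCAU
```

An antisense RNA directed at only part of a transcript, as contemplated above, corresponds to applying `antisense_rna` to the relevant fragment rather than to the full sequence.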
[0087] The term "expression", as used herein, refers to the transcription and stable
accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid molecule of
the invention. Expression may also refer to translation of mRNA into a polypeptide. 
[0088] "Mature" protein refers to a post-translationally processed polypeptide; i.e.,
one from which any pre- or propeptides present in the primary translation product have been
removed. "Precursor" protein refers to the primary product of translation of mRNA; i.e., with 
pre- and propeptides still present. Pre- and propeptides may be but are not limited to 
intracellular localization signals such as transit peptides. 

[0089] A "signal peptide" is an amino acid sequence that is translated in conjunction
with a protein and directs the protein across cell membranes of the cell in which the protein is
made. For example, a signal peptide can be used to direct a mature enzyme into a cell's 
chloroplast or into a cell's vacuole via endoplasmic reticulum. A signal peptide is also 
referred to as a signal protein. "Signal sequence" refers to a nucleotide sequence that
encodes a signal peptide. 

[0090] The signal peptide is covalently bound to the "mature enzyme" or
"passenger enzyme." The term "precursor protein" identifies a polypeptide having a signal
peptide and a passenger peptide covalently attached to each other. Typically, the carboxy 
terminus of the signal peptide is covalently attached to the amino terminus of the passenger 
peptide. The passenger peptide and signal peptide can be encoded by the same gene locus, 
that is, homologous to each other, in that they are encoded in a manner isolated from a single 
source. Alternatively, the signal peptide and passenger peptide can be heterologous to each 
other, i.e., the signal peptide and passenger peptide can be from different genes and/or 
different organisms. The transit peptide may be derived from monocotyledonous or 
dicotyledonous plants upon choice of the artisan. The term "signal peptide" includes amino 
acid sequences that are translated in conjunction with a protein and direct the protein to the
secretory system (Chrispeels, 1991). If the protein is to be directed to a vacuole, a vacuolar
targeting signal (supra) can further be added, or if to the endoplasmic reticulum, an
endoplasmic reticulum retention signal (supra) may be added. If the protein is to be directed
to the nucleus, any signal peptide present should be removed and instead a nuclear 
localization signal included (Raikhel, 1992). 

[0091] By "mature peptide" or "passenger peptide" is meant a polypeptide which is
found after processing and passing into an organelle and which is functional in the organelle 
for its intended purpose. Passenger peptides are originally made in a precursor form that 
includes a signal peptide and the passenger peptide. Upon entry into an organelle, the signal 
peptide portion is cleaved, thus leaving the "passenger" or "mature" peptide. Passenger 
peptides are the polypeptides typically obtained upon purification from a homogenate, the 
sequence of which can be determined as described herein. 

[0092] "Transformation" refers to the transfer of a nucleic acid molecule into the
genome of a host organism, resulting in genetically stable inheritance. Host organisms 
containing the transformed nucleic acid molecules are referred to as "transgenic" or 
"recombinant" or "transformed" organisms. 

[0093] As used herein, "transgenic plant" includes reference to a plant which
comprises within its genome a foreign polynucleotide. Generally, the foreign polynucleotide 
is stably integrated within the genome such that the polynucleotide is passed on to successive 
generations. The foreign polynucleotide may be integrated into the genome alone or as part 
of a recombinant expression cassette. "Transgenic" is used herein to include any cell, cell 
line, callus, tissue, plant part or plant, the genotype of which has been altered by the presence
of foreign nucleic acid including those transgenics initially so altered as well as those created 
by sexual crosses or asexual propagation from the initial transgenic. The term "transgenic" as 
used herein does not encompass the alteration of the genome (chromosomal or extra- 
chromosomal) by conventional plant breeding methods or by naturally occurring events such 
as random cross-fertilization, non-recombinant viral infection, non-recombinant bacterial 
transformation, non-recombinant transposition, or spontaneous mutation. 
[0094] The terms "plasmid", "vector" and "cassette" refer to an extrachromosomal
element often carrying genes which are not part of the central metabolism of the cell, and 
usually in the form of circular double-stranded DNA molecules. Such elements may be 
autonomously replicating sequences, genome integrating sequences, phage or nucleotide 
sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any 
source, in which a number of nucleotide sequences have been joined or recombined into a 
unique construction which is capable of introducing a promoter fragment and DNA sequence 
for a selected gene product along with appropriate 3' untranslated sequence into a cell.
"Transformation cassette" refers to a specific vector containing a foreign gene and having 
elements in addition to the foreign gene that facilitate transformation of a particular host cell. 
"Expression cassette" refers to a specific vector containing a foreign gene and having 
elements in addition to the foreign gene that allow for enhanced expression of that gene in a 
foreign host. 

[0095] The term "sequence analysis software" refers to any computer algorithm or
software program that is useful for the analysis of nucleotide or amino acid sequences. 
"Sequence analysis software" may be commercially available or independently developed. 
Typical sequence analysis software will include but is not limited to the GCG suite of 
programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, WI), 
BLASTP, BLASTN, BLASTX (Altschul, et al., 1990), and DNASTAR (DNASTAR, Inc. 
1228 S. Park St. Madison, WI 53715 USA). Within the context of this application it will be 
understood that, where sequence analysis software is used for analysis, the results of the
analysis will be based on the "default values" of the program referenced, unless otherwise 
specified. As used herein "default values" will mean any set of values or parameters which 
originally load with the software when first initialized. 

[0096] One aspect of the invention provides sense and antisense nucleic acid fragments that
encode C3H or c3h, and nucleic acid fragments substantially similar thereto.
[0097] It is of course not intended that the present invention be limited to these
exemplary nucleotide sequences, but the invention also encompasses nucleic acid fragments 
substantially similar to those set forth above. In a preferred aspect, the present invention 
provides nucleic acid fragments that encode polypeptides in accordance with the invention 
that have at least about 80% identity to the amino acid sequence of SEQ ID NO:4, SEQ ID 
NO:5, SEQ ID NO:6 or SEQ ID NO:7, more preferably, at least about 90% identity to one of
these sequences and most preferably at least about 95% identity. Similarly, preferred nucleic
acid sequences corresponding to the instant ref8 gene are at least 80% identical to one of the
nucleic acid sequences reported herein. More preferred nucleic acid fragments are at least
90% identical to one of the sequences herein. Most preferred are nucleic acid fragments that 
are at least 95% identical to one of the nucleic acid fragments reported herein. 
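
By way of illustration only, the identity thresholds recited above can be expressed as a short Python sketch. The function name, the tier labels as return values, and the treatment of "at least about" as a hard numerical cutoff are assumptions made for this example and form no part of the original disclosure. The percent identity is assumed to have been computed beforehand by sequence analysis software.

```python
def preference_tier(percent_identity: float) -> str:
    """Map a precomputed percent identity onto the tiers recited in the text.

    Assumption: "at least about N%" is treated as a hard cutoff at N.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity >= 95.0:
        return "most preferred"
    if percent_identity >= 90.0:
        return "more preferred"
    if percent_identity >= 80.0:
        return "preferred"
    return "below the recited thresholds"


if __name__ == "__main__":
    # Hypothetical identity values, not measured data.
    for value in (78.5, 80.0, 92.3, 97.0):
        print(f"{value:5.1f}% -> {preference_tier(value)}")
```

A practitioner would likely apply some tolerance around each cutoff to reflect the word "about"; the sketch omits this for simplicity.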
[0098] In another aspect, the present invention relates to methods and compositions for
obtaining transformed cells, said cells expressing c3h. In this regard, inventive nucleotide 
sequences can be incorporated into vectors, which in turn can be used to transform cells. 
Expression of c3h results in the cell having altered metabolic activity relative to 
nontransformed cells. Transformants harboring an expressible inventive nucleotide sequence 
demonstrate increased levels of activity when appropriate substrates are available, and have 
other desirable features as would occur to a person of ordinary skill in the art. These and other 
features of the invention are described in further detail below. 

[0099] Inventive DNA sequences can be incorporated into the genome of a plant or
microorganism using conventional recombinant DNA technology, thereby making a 
transformed plant or microorganism that expresses c3h. As described above, the term 
"genome" as used herein is intended to refer to DNA which is present in a plant or 
microorganism and which is heritable by progeny during propagation thereof. As such, an 
inventive transformed plant or microorganism may alternatively be produced by producing 
F1 or higher generation progeny of a directly transformed plant or microorganism, wherein
the progeny comprise the foreign nucleotide sequence. Transformed plants or 
microorganisms and progeny thereof are all contemplated by the invention and are all 
intended to fall directly within the meaning of the terms "transformed plant" and 
"transformed microorganism." 

[0100] In this manner, the present invention contemplates the use of transformed
plants that are selfed to produce an inbred plant. The inbred plant produces seed containing 
the gene of interest. These seeds can be grown to produce plants that express the polypeptide 
of interest. The inbred lines can also be crossed with other inbred lines to produce hybrids. 
Parts obtained from the regenerated plant, such as flowers, seeds, leaves, branches, fruit, and
the like are covered by the invention provided that said parts contain genes encoding and/or 
expressing the protein of interest. Progeny and variants, and mutants of the regenerated 
plants are also included within the scope of the invention. 

[0101] In diploid plants, typically one parent may be transformed and the other parent
is the wild type. After crossing the parents, the first generation hybrids (F1) are selfed to
produce second generation hybrids (F2). Those plants exhibiting the highest levels of
expression can then be chosen for further breeding.
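
The expected outcome of selfing an F1 plant can be sketched as a small worked example in Python. The sketch assumes a single transgene insertion at one locus, an F1 plant carrying the transgene on one homolog only, and simple Mendelian segregation; the allele notation ("T" for the transgene, "-" for its absence) is hypothetical and is not taken from the original text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def self_cross(genotype):
    """Expected offspring genotype frequencies when a diploid plant is selfed.

    `genotype` is a pair of alleles at one locus, e.g. ("T", "-") for a plant
    carrying the transgene on one homolog only (hypothetical notation).
    """
    # Each gamete carries one of the two alleles with equal probability,
    # so every ordered pair of gametes is equally likely.
    offspring = Counter(tuple(sorted(pair)) for pair in product(genotype, repeat=2))
    total = sum(offspring.values())
    return {g: Fraction(n, total) for g, n in offspring.items()}


if __name__ == "__main__":
    f2 = self_cross(("T", "-"))  # F1 plant with a single copy of the transgene
    for genotype, freq in sorted(f2.items()):
        print("".join(genotype), freq)
    carriers = sum(freq for g, freq in f2.items() if "T" in g)
    print("fraction of F2 carrying the transgene:", carriers)
```

Under these assumptions one quarter of the F2 plants are expected to be homozygous for the transgene, which is one reason F2 individuals may differ in expression level.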

[0102] Standard recombinant DNA and molecular cloning techniques used in
accordance with the present invention are well known in the art. See, e.g., Sambrook, J.,
Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Second Edition,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1989) (hereinafter
"Maniatis"); Silhavy, et al. (1984); Ausubel, et al. (1992); Sambrook et al. (1989);
Glover (1985); Guthrie and Fink (1991); Weissbach and Weissbach (1986); Zaitlin et al.
(1985) and Gelvin et al. (1990).

Method of Use: Recombinant Microbial Expression of Mutant ref8 

[0103] It will be useful to recombinantly express the ref8 gene for the production of
mutant C3H with altered biological activity in heterologous host cells, particularly in the cells 
of microbial hosts, to produce large amounts of the mutant c3h enzyme. 
[0104] Preferred heterologous host cells for expression of the instant genes and nucleic
acid molecules are microbial hosts. Specific suitable hosts include but are not limited to
Aspergillus, Trichoderma, Saccharomyces, Pichia, Candida, Hansenula, Salmonella, 
Bacillus, Acinetobacter, Rhodococcus, Streptomyces, Escherichia, and Pseudomonas, where 
E. coli is most preferred. 

[0105] Microbial expression systems and expression vectors containing regulatory
sequences that direct high level expression of foreign proteins are well known to those skilled 
in the art. A wide variety of such systems and vectors could be used to construct chimeric 
genes for production of the gene products of the instant sequences. These chimeric genes 
could then be introduced into appropriate microorganisms via transformation to provide high 
level expression of the enzymes. 

[0106] Vectors or cassettes useful for the transformation of suitable host cells are well
known in the art. Typically the vector or cassette contains sequences directing transcription 
and translation of the relevant gene, a selectable marker, and sequences allowing autonomous 
replication or chromosomal integration. Suitable vectors comprise a region 5' of the gene
which harbors transcriptional initiation controls and a region 3' of the DNA fragment which
controls transcriptional termination. Both control regions are preferably derived from genes 
homologous to the transformed host cell, although it is to be understood that such control 
regions need not be derived from the genes native to the specific species chosen as a 
production host. 

[0107] Initiation control regions or promoters, which are useful to drive expression of
the instant genes in the desired host cell, are numerous and familiar to those skilled in the art.
Virtually any promoter capable of driving these genes is suitable for the present invention
including but not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH,
ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); AOX1
(useful for expression in Pichia); and lac, ara, tet, trp, λPL, λPR, T7, tac, and trc (useful for
expression in Escherichia coli) as well as the amy, apr, npr promoters and various phage
promoters useful for expression in Bacillus.

[0108] Termination control regions may also be derived from various genes native to
the preferred hosts. Optionally, a termination site may be unnecessary; however, it is most
preferred if included.

[0109] It is readily understood that, in the case of transforming prokaryotes, it is not 
necessary to include a signal peptide in the coding region of the vector. Rather, an inventive 
DNA construct for transforming, for example, bacteria, may be made by simply attaching a 
start codon directly to, and in the proper reading frame with, a nucleic acid fragment 
encoding a mature peptide. Of course, other elements are preferably present as described 
herein, such as a promoter upstream of the start codon and a termination sequence 
downstream of the coding region. 
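
The step of attaching a start codon in the proper reading frame can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name, the validation checks, and the example fragment are hypothetical, and the fragment shown is a toy sequence rather than any sequence of the invention. The sketch handles only the coding region; a promoter and termination sequence would be supplied by the vector.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_start_codon(mature_cds: str) -> str:
    """Prepend a start codon to a fragment encoding a mature peptide.

    Assumption: the fragment begins at the first base of the first codon
    of the mature peptide, so prepending three bases preserves the frame.
    """
    seq = mature_cds.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("expected a non-empty DNA sequence of A, C, G, T")
    if len(seq) % 3 != 0:
        raise ValueError("fragment is not a whole number of codons")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # A stop codon anywhere but the last position would truncate the peptide.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("in-frame stop codon before the end of the fragment")
    return START_CODON + seq


if __name__ == "__main__":
    toy_fragment = "GCTAAAGGTTAA"  # hypothetical: Ala-Lys-Gly-stop
    print(add_start_codon(toy_fragment))
```

The checks are deliberately conservative: a fragment whose length is not a multiple of three, or that contains an internal in-frame stop codon, is rejected rather than silently repaired.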

[0110] Optionally it may be desired to produce the instant gene product as a secretion
product of the transformed host. Secretion of desired proteins into the growth media has the 
advantages of simplified and less costly purification procedures. It is well known in the art 
that secretion signal sequences are often useful in facilitating the active transport of 
expressible proteins across cell membranes. The creation of a transformed host capable of 
secretion may be accomplished by the incorporation of a DNA sequence that codes for a 
secretion signal that is functional in the production host. Methods for choosing
appropriate signal sequences are well known in the art (see for example EP 546049; WO
9324631). The secretion signal DNA or facilitator may be located between the expression-
controlling DNA and the instant gene or gene fragment, and in the same reading frame with
the latter. 

Method of Use: Expression of Mutant ref8 in Transgenic Plants 

[0111] The ref8 gene may be used to create transgenic plants having the ability to
express mutant c3h. Transgenic plants expressing a functioning ref8 gene exhibit
modifications in their secondary metabolite profile. 

[0112] Preferred plant hosts will be any variety that will support a high production
level of the mutant c3h proteins. Suitable green plants include but are not limited to soybean,
rapeseed (Brassica napus, B. campestris), sunflower (Helianthus annuus), cotton (Gossypium
hirsutum), corn, tobacco (Nicotiana tabacum), alfalfa (Medicago sativa), wheat (Triticum sp.),
barley (Hordeum vulgare), oats (Avena sativa L.), sorghum (Sorghum bicolor), rice (Oryza
sativa), Arabidopsis, cruciferous vegetables (broccoli, cauliflower, cabbage, etc.), melons,
carrots, celery, parsley, tomatoes, potatoes, strawberries, peanuts, grapes, grass seed crops,
sugar beets, sugar cane, beans, peas, rye, flax, hardwood trees, softwood trees, and forage
grasses.

[0113] The present invention further provides recombinant expression cassettes
comprising the ref8 coding region. A recombinant expression cassette will typically
comprise a ref8 polynucleotide of the present invention operably linked to transcriptional
initiation regulatory sequences which will direct the transcription of the gene in the intended 
host cell, such as tissues of a transformed plant. An expression vector according to the 
invention may be either naturally or artificially produced from parts derived from 
heterologous sources, which parts may be naturally occurring or chemically synthesized, and 
wherein the parts have been joined by ligation or other means known in the art. The 
introduced coding sequence is preferably under control of a promoter and thus will be 
generally downstream from the promoter. Stated alternatively, the promoter sequence will be 
generally upstream (i.e., at the 5' end) of the coding sequence. The phrase "under control of"
contemplates the presence of such other elements as may be necessary to achieve 
transcription of the introduced sequence. As such, in one representative example, enhanced 
production of mutant C3H polypeptide may be achieved by inserting an inventive nucleotide 
sequence in a vector downstream from and operably linked to a promoter sequence capable of 
driving expression in a host cell. Two DNA sequences (such as a promoter region sequence 
and a C3H-encoding nucleotide sequence) are said to be operably linked if the nature of the
linkage between the two DNA sequences does not (1) result in the introduction of a frame-
shift mutation, (2) interfere with the ability of the promoter region sequence to direct the
transcription of the desired nucleotide sequence, or (3) interfere with the ability of the desired 
nucleotide sequence to be transcribed by the promoter region sequence. 
[0114] For example, plant expression vectors may include (1) a cloned plant gene
under the transcriptional control of 5' and 3' regulatory sequences and (2) a dominant
selectable marker. Such plant expression vectors may also contain, if desired, a promoter 
regulatory region (e.g., one conferring inducible or constitutive, environmentally- or 
developmentally-regulated, or cell- or tissue-specific/selective expression), a transcription 
initiation start site, a ribosome binding site, an RNA processing signal, a transcription 
termination site, and/or a polyadenylation signal. 

[0115] A plant promoter fragment can be employed which will direct expression of ref8
in all tissues of a regenerated plant. Such promoters are referred to herein as "constitutive" 
promoters and are active under most environmental conditions and states of development or 
cell differentiation. Examples of constitutive promoters include the cauliflower mosaic virus 
(CaMV) 35S transcription initiation region, the 1'- or 2'-promoter derived from T-DNA of
Agrobacterium tumefaciens, the ubiquitin 1 promoter, the Smas promoter, the cinnamyl alcohol 
dehydrogenase promoter (U.S. Patent No. 5,683,439), the Nos promoter, the pEmu promoter, 
the rubisco promoter, and the GRP1-8 promoter. 

[0116] Alternatively, the plant promoter can direct expression of a polynucleotide of
the present invention in a specific tissue or may be otherwise under more precise 
environmental or developmental control. Such promoters are referred to here as "inducible" 
promoters. Environmental conditions that may affect transcription by inducible promoters
include pathogen attack, anaerobic conditions, or the presence of light. Examples of
inducible promoters are the Adh1 promoter, which is inducible by hypoxia or cold stress, the
Hsp70 promoter, which is inducible by heat stress, and the PPDK promoter, which is 
inducible by light. 

[0117] Examples of promoters under developmental control include promoters that
initiate transcription only, or preferentially, in certain tissues, such as leaves, roots, fruit, 
seeds, or flowers. Exemplary promoters include the anther specific promoter 5126 (U.S. 
Patent Nos. 5,689,049 and 5,689,051), glob-1 promoter, and gamma-zein promoter. The 
operation of a promoter may also vary depending on its location in the genome. Thus, an 
inducible promoter may become fully or partially constitutive in certain locations. 
[0118] Both heterologous and non-heterologous (i.e., endogenous) promoters can be
employed to direct expression of the ref8 gene. These promoters can also be used, for
example, in recombinant expression cassettes to drive expression of antisense nucleic acids to
provide, reduce, increase, or alter concentration and/or composition of the C3H protein in a 
desired tissue. Thus, in some embodiments, the nucleic acid construct will comprise a 
promoter functional in a plant cell, such as in Zea mays or tobacco, operably linked to the ref8
gene. Promoters useful in these embodiments include endogenous promoters driving 
expression of C3H. 

[0119] Where c3h polypeptide expression is desired, it is generally desirable to
include a polyadenylation region at the 3'-end of a polynucleotide coding region of ref8. The
polyadenylation region can be derived from the wild-type gene, from a variety of other plant 
genes, or from T-DNA. The 3' end sequence to be added can be derived from, for example, 
the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or 
less preferably from any other eukaryotic gene. 

[0120] An intron sequence can be added to the 5' untranslated region or the coding
sequence or the partial coding sequence to increase the amount of the mature message that 
accumulates in the cytosol. Inclusion of a spliceable intron in the transcription unit in both 
plant and animal expression constructs has been shown to increase gene expression at both the 
mRNA and protein levels up to 1000-fold. Buchman and Berg (1988); Callis, et al. (1987).
Such intron enhancement of gene expression is typically greatest when the intron is placed near the 5' end
of the transcription unit. Use of the maize Adh1-S introns 1, 2, and 6 and the Bronze-1 intron
is known in the art. See generally, The Maize Handbook, Chapter 116, Freeling and Walbot,
Eds. (1994). The vector comprising the REF8 sequence will typically comprise a marker gene 
which confers a selectable phenotype on plant cells. Typical vectors useful for expression of 
genes in higher plants are well known in the art and include vectors derived from the tumor- 
inducing (Ti) plasmid of Agrobacterium tumefaciens described by Rogers, et al. (1987). 
[0121] ref8 can be expressed in either sense or anti-sense orientation as desired. It
will be appreciated that control of gene expression in either sense or anti-sense orientation 
can have a direct impact on the observable plant characteristics. Antisense technology can be 
conveniently used to inhibit C3H gene expression in plants. To accomplish this, ref8 or a
portion of ref8 is cloned and operably linked to a promoter such that the anti-sense strand of
RNA will be transcribed. The construct is then transformed into plants and the antisense 
strand of RNA is produced. In plant cells, it has been shown that antisense RNA inhibits 
gene expression by preventing the accumulation of mRNA which encodes the enzyme of 
interest; see, e.g., Sheehy, et al. (1988); and Hiatt et al., U.S. Patent No. 4,801,340.
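
The relationship between a sense fragment and the insert that yields antisense RNA can be sketched in Python. This is an illustration under stated assumptions: the function names are hypothetical, the example fragment is a toy sequence and not a ref8 sequence, and the sketch models only sequence orientation, not cloning, promoter choice, or RNA processing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    seq = dna.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("expected only A, C, G, T")
    return seq.translate(_COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """RNA with the same sequence as the given coding strand (T -> U)."""
    return coding_strand.upper().replace("T", "U")


if __name__ == "__main__":
    sense_fragment = "ATGGCTAAAGGT"  # hypothetical fragment, not a ref8 sequence
    # Cloning the fragment in reverse orientation places this sequence
    # downstream of the promoter, reading 5' to 3'.
    antisense_insert = reverse_complement(sense_fragment)
    print("antisense insert:", antisense_insert)
    print("antisense RNA:   ", transcribe(antisense_insert))
    print("sense mRNA:      ", transcribe(sense_fragment))
```

The antisense RNA printed here is the reverse complement of the sense mRNA, which is what allows the two transcripts to pair.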
[0122] Another method of suppression is sense suppression (i.e., co-suppression).
Introduction of nucleic acid configured in the sense orientation has been shown to be an 
effective means by which to block the transcription of target genes. For an example of the 
use of this method to modulate expression of endogenous genes see Napoli et al. (1990) and
U.S. Patent No. 5,034,323. Such a method may be applied to the regulation of C3H 
expression. 

[0123] Catalytic RNA molecules or ribozymes can also be used to inhibit expression
of plant genes. It is possible to design ribozymes that specifically pair with virtually any 
target RNA and cleave the phosphodiester backbone at a specific location, thereby 
functionally inactivating the target RNA. In carrying out this cleavage, the ribozyme is not 
itself altered, and is thus capable of recycling and cleaving other molecules, making it a true 
enzyme. The inclusion of ribozyme sequences within antisense RNAs confers RNA-cleaving 
activity upon them, thereby increasing the activity of the constructs. The design and use of 
target RNA-specific ribozymes is described in Haseloff, et al. (1988). 

[0124] To introduce ref8 into a plant, generally the gene will first be incorporated into
a recombinant expression cassette or vector, by a variety of methods known in the art. See,
for example, Weising, et al. (1988). For example, the DNA construct may be introduced
directly into the genomic DNA of the plant cell using techniques such as electroporation,
polyethylene glycol (PEG) poration, particle bombardment, silicon fiber delivery, or
microinjection of plant cell protoplasts or embryogenic callus. See, e.g., Tomes, et al., Direct
DNA Transfer into Intact Plant Cells Via Microprojectile Bombardment, pp. 197-213 in Plant
Cell, Tissue and Organ Culture, Fundamental Methods, eds. O. L. Gamborg and G. C.
Phillips. Springer-Verlag Berlin Heidelberg New York, 1995. The introduction of DNA
constructs using PEG precipitation is described in Paszkowski, et al. (1984). Electroporation
techniques are described in Fromm, et al. (1985). Ballistic transformation techniques are
described in Klein, et al. (1987).

[0125] Alternatively, Agrobacterium tumefaciens-mediated transformation techniques
may be used. See, for example, Horsch, et al. (1984); Fraley, et al. (1983); and Plant
Molecular Biology: A Laboratory Manual, Chapter 8, Clark, Ed. (1997). The DNA 
constructs may be combined with suitable T-DNA flanking regions and introduced into a 
conventional Agrobacterium tumefaciens host vector. The virulence functions of the 
Agrobacterium tumefaciens host will direct the insertion of the construct and adjacent marker 
into the plant cell DNA when the cell is infected by the bacteria. See, U.S. Patent No. 
5,591,616. Although Agrobacterium is useful primarily in dicots, certain monocots can be 
transformed by Agrobacterium. For instance, Agrobacterium transformation of maize is
described in U.S. Patent No. 5,550,318. 

[0126] Other methods of transfection or transformation include (1) Agrobacterium
rhizogenes-mediated transformation (see, e.g., Lichtenstein and Fuller, 1987; and
Lichtenstein and Draper, 1985; Application PCT/US87/02512 (WO 88/02405, published
Apr. 7, 1988) describes the use of A. rhizogenes strain A4 and its Ri plasmid along with A.
tumefaciens vectors pARC8 or pARC16), (2) liposome-mediated DNA uptake (see, e.g.,
Freeman, et al., 1984), and (3) the vortexing method (see, e.g., Kindle, 1990).
[0127] DNA can also be introduced into plants by direct DNA transfer into pollen as
described by Zhou, et al. (1983); Hess (1987); and Luo, et al. (1988). Expression of REF8 can be
obtained by injection of the DNA into reproductive organs of a plant as described by Pena, et
al. (1987). The gene can also be injected directly into the cells of immature embryos and
introduced during the rehydration of desiccated embryos as described by Neuhaus, et al. (1987); and Benbrook, et
al. (1986). A variety of plant viruses that can be employed as vectors are known in the art
and include cauliflower mosaic virus (CaMV), geminivirus, brome mosaic virus, and tobacco 
mosaic virus. 

[0128] Plant cells that directly result or are derived from the nucleic acid introduction
techniques can be cultured to regenerate a whole plant that possesses the introduced 
genotype. Such regeneration techniques often rely on manipulation of certain phytohormones 
in a tissue culture growth medium. Plant cells can be regenerated, e.g., from single cells,
callus tissue or leaf discs according to standard plant tissue culture techniques. It is well 
known in the art that various cells, tissues, and organs from a wide variety of plants can be 
successfully cultured to regenerate an entire plant. Plant regeneration from cultured 
protoplasts is described in Evans, et al. (1983); and Binding (1985).
[0129] The regeneration of plants from either single plant protoplasts or various
explants is well known in the art. See, for example, Weissbach and Weissbach (eds., 1988). 
This regeneration and growth process includes the steps of selection of transformant cells and 
shoots, rooting the transformant shoots and growth of the plantlets in soil. For maize cell 
culture and regeneration see generally, Freeling and Walbot (Eds., 1994); Sprague and 
Dudley (Eds., 1988). For transformation and regeneration of maize see, Gordon-Kamm, et al. 
(1990). 

[0130] The regeneration of plants containing the polynucleotide of the present
invention and introduced by Agrobacterium from leaf explants can be achieved as described
by Horsch, et al. (1985). In this procedure, transformants are grown in the presence of a
selection agent and in a medium that induces the regeneration of shoots in the plant species
being transformed as described by Fraley, et al. (1983). This procedure typically produces 
shoots within two to four weeks and these transformant shoots are then transferred to an 
appropriate root-inducing medium containing the selective agent and an antibiotic to prevent 
bacterial growth. Transgenic plants of the present invention may be fertile or sterile. 
Additional literature describing plant and/or microorganism transformation includes the 
following, each of which is incorporated herein by reference in its entirety: Zhijian Li, et al. 
(1992); Parsons, et al. (1997); Daboussi, et al. (1989); Leung, et al. (1990); Koetter, et al. 
(1990); Strasser et al., "Cloning of yeast xylose reductase and xylitol dehydrogenase genes 
and their use," German patent application (1990); Hallborn, et al. (1991); Becker and 
Guarente (1991); Ammerer (1983); Sarthy, et al. (1987); U.S. Patent Nos. 4,945,050, 
5,141,131, 5,177,010, 5,104,310, 5,149,645, 5,469,976, 5,464,763, 4,940,838, 4,693,976, 
5,591,616, 5,231,019, 5,463,174, 4,762,785, 5,004,863, 5,159,135, 5,302,523, 5,464,765, 
5,472,869, 5,384,253; European Patent Application Nos. 0131624B1, 120516, 159418B1, 
176112, 116718, 290799, 320500, 604662, 627752, 0267159, 0292435; WO 87/06614; WO 
92/09696; and WO 93/21335. 

[0131] Once the recombinant DNA is introduced into the plant tissue, successful
transformants can be screened using standard techniques such as the use of marker genes, 
e.g., genes encoding resistance to antibiotics. Additionally, the level of expression of the 
foreign DNA may be measured at the transcriptional level, by measuring the amount of 
protein synthesized or by assaying to determine the level of enzyme function in the plant. 
One of skill will recognize that after the recombinant expression cassette is stably 
incorporated in transgenic plants and confirmed to be operable, it can be introduced into other 
plants by sexual crossing. Any of a number of standard breeding techniques can be used, 
depending upon the species to be crossed. In vegetatively propagated crops, mature 
transgenic plants can be propagated by the taking of cuttings or by tissue culture techniques 
to produce multiple identical plants. Selection of desirable transgenics is made and new 
varieties are obtained and propagated vegetatively for commercial use. In seed propagated 
crops, mature transgenic plants can be self-crossed to produce a homozygous inbred plant. 
The inbred plant produces seed containing the newly introduced heterologous nucleic acid. 
These seeds can be grown to produce plants that would produce the selected phenotype. 
Parts obtained from the regenerated plant, such as flowers, seeds, leaves, branches, fruit, and 
the like are included in the invention, provided that these parts comprise cells comprising the 
isolated nucleic acid of the present invention. Progeny and variants, and mutants of the 
regenerated plants are also included within the scope of the invention, provided that these
parts comprise the introduced nucleic acid sequences. 

[0132] Transgenic plants expressing a polynucleotide of the present invention can be
screened for transmission of the nucleic acid of the present invention by, for example, 
standard immunoblot and DNA detection techniques. Expression at the RNA level can be 
determined initially to identify and quantitate expression-positive plants. Standard techniques 
for RNA analysis can be employed and include PCR amplification assays using 
oligonucleotide primers designed to amplify only the heterologous RNA templates and 
solution hybridization assays using heterologous nucleic acid-specific probes. The RNA- 
positive plants can then be analyzed for protein expression by Western immunoblot analysis 
using the specifically reactive antibodies of the present invention. In addition, in situ 
hybridization and immunocytochemistry according to standard protocols can be done using 
heterologous nucleic acid specific polynucleotide probes and antibodies, respectively, to 
localize sites of expression within transgenic tissue. Generally, a number of transgenic lines 
are screened for the incorporated nucleic acid to identify and select plants with the most 
appropriate expression profiles. 



Methods of Use: Isolation of Homologs and Orthologs 

[0133] In another aspect of the invention, the sequence of the REF8 or ref8 genes
may be used to isolate orthologous genes encoding homologous proteins from other plants,
which genes, and the expression products thereof, can be readily tested for functionality in
accordance with the present invention by a person of ordinary skill in the art. The
full-length DNA sequences encoding C3H from sorghum, soybean and sweetgum share 57 to
73% identity with the coding sequence of Arabidopsis CYP98 (Table 1). Those in the art would
know to clone an ortholog of the REF8 gene and use the sequence thereof to down-regulate
the expression of the respective plant's endogenous C3H gene.

Table 1. Sequence Pair Distances (percent identity)
(1) A.t_Cyp98CDS.seq
(2) Sorghum_Cyp98A1.seq
(3) Soybean_Cyp98A2.seq
(4) Sweetgum_Cyp98.seq
Clustal method with weighted residue weight table.
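
By way of illustration only, a percent identity value of the kind summarized in Table 1 can be computed from a pairwise alignment roughly as follows. This is a minimal sketch and not the Clustal procedure itself: the function name percent_identity, the convention of skipping gapped columns, and the toy sequences are assumptions introduced here and are not taken from the CYP98 alignments.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over alignment columns without a gap.

    Assumption: columns containing a gap are skipped; other tools may
    count gapped columns differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy, made-up aligned fragments (hypothetical; not the CYP98 sequences).
print(percent_identity("ATGGC-TTA", "ATGACGTTA"))  # 7 of 8 ungapped columns match: 87.5
```

Different alignment programs normalize identity differently (for example by the shorter sequence length or by the full alignment length), so values computed this way need not reproduce those obtained by the Clustal method.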

[0134] It is well known that plants and microorganisms of a wide variety of species
commonly express and utilize analogous enzymes and/or polypeptides which have varying
degrees of degeneracy, and yet which effectively provide the same or a similar function. For 
example, an amino acid sequence isolated from one species may differ to a certain degree 
from the sequence set forth in SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6 or SEQ ID NO:7 
and yet have similar functionality. Amino acid sequences comprising such variations, and 
methods for identifying and isolating the same, are included within the scope of the present 
invention. 

[0135] Isolation of homologous genes using sequence-dependent protocols is well
known in the art. Examples of sequence-dependent protocols include, but are not limited to,
methods of nucleic acid hybridization, and methods of DNA and RNA amplification as 
exemplified by various uses of nucleic acid amplification technologies (e.g. polymerase chain 
reaction (PCR), Mullis et al., U.S. Patent 4,683,202; ligase chain reaction (LCR), Tabor, et al. 
(1985); or strand displacement amplification (SDA), Walker, et al. (1992)). 
[0136] For example, genes encoding similar proteins or polypeptides to the C3H
polypeptide could be isolated directly by using all or a portion of the instant nucleic acid
molecules as DNA hybridization probes to screen libraries from any desired bacteria using
methodology well known to those skilled in the art. Specific oligonucleotide probes based
upon the instant ref8 sequences can be designed and synthesized by methods known in the art
(Maniatis). Moreover, the entire sequences can be used directly to synthesize DNA probes
by methods known to the skilled artisan such as random primer DNA labeling, nick
translation, or end-labeling techniques, or RNA probes using available in vitro transcription
systems. In addition, specific primers can be designed and used to amplify part or the full
length of the instant sequences. The resulting amplification products can be labeled directly
during amplification reactions or labeled after amplification reactions, and used as probes to
isolate full length DNA fragments under conditions of appropriate stringency.
[0137] Generally two short segments of the instant sequences may be used in
polymerase chain reaction protocols to amplify longer nucleic acid molecules encoding
homologous genes from DNA or RNA. The polymerase chain reaction may also be
performed on a library of cloned nucleic acid molecules wherein the sequence of one primer 
is derived from the instant REF8 nucleic acid molecules, and the sequence of the other primer 
takes advantage of the presence of the polyadenylic acid tracts to the 3' end of the mRNA 
precursor encoding the instant genes. Alternatively, the second primer sequence may be 
based upon sequences derived from the cloning vector. For example, the skilled artisan can 
follow the RACE protocol (Frohman, et al., 1988) to generate cDNAs by using PCR to 
amplify copies of the region between a single point in the transcript and the 3' or 5' end. 
Primers oriented in the 3' and 5' directions can be designed from the instant sequences. Using
commercially available 3' RACE or 5' RACE systems (BRL), specific 3' or 5' cDNA
fragments can be isolated (Ohara, et al., 1989; Loh, et al., 1989). Typically, in PCR-type 
amplification techniques, the primers have different sequences and are not complementary to 
each other. Depending on the desired conditions, the sequences of the primers should be 
designed to provide for both efficient and faithful replication of the target nucleic acid. 
Methods of PCR primer design are common and well known in the art (Thein and Wallace,
1986; Rychlik, 1993).

[0138] Alternatively the instant REF8 and ref8 sequences may be employed as
hybridization reagents for the identification of homologs. The basic components of a nucleic
acid hybridization test include a probe, a sample suspected of containing the gene or gene 
fragment of interest, and a specific hybridization method. Probes of the present invention are 
typically single stranded nucleic acid sequences that are complementary to the nucleic acid 
sequences to be detected. Probes are "hybridizable" to the nucleic acid sequence to be 
detected. The probe length can vary from 5 bases to tens of thousands of bases, and will 
depend upon the specific test to be done. Typically a probe length of about 15 bases to about
30 bases is suitable. Only part of the probe molecule need be complementary to the nucleic 
acid sequence to be detected. In addition, the complementarity between the probe and the 
target sequence need not be perfect. Hybridization does occur between imperfectly 
complementary molecules with the result that a certain fraction of the bases in the hybridized 
region are not paired with the proper complementary base. 

[0139] Hybridization methods are well defined. Typically the probe and sample must
be mixed under conditions which will permit nucleic acid hybridization. This involves
contacting the probe and sample in the presence of an inorganic or organic salt under the 
proper concentration and temperature conditions. The probe and sample nucleic acids must 
be in contact for a long enough time that any possible hybridization between the probe and
sample nucleic acid may occur. The concentration of probe or target in the mixture will
determine the time necessary for hybridization to occur. The higher the probe or target 
concentration the shorter the hybridization incubation time needed. Optionally a chaotropic 
agent may be added. The chaotropic agent stabilizes nucleic acids by inhibiting nuclease 
activity. Furthermore, the chaotropic agent allows sensitive and stringent hybridization of 
short oligonucleotide probes at room temperature (Van Ness and Chen, 1991). Suitable 
chaotropic agents include guanidinium chloride, guanidinium thiocyanate, sodium 
thiocyanate, lithium tetrachloroacetate, sodium perchlorate, rubidium tetrachloroacetate, 
potassium iodide, and cesium trifluoroacetate, among others. Typically, the chaotropic agent 
will be present at a final concentration of about 3M. If desired, one can add formamide to the 
hybridization mixture, typically 30-50% (v/v). 

[0140] Various hybridization solutions can be employed. Typically, these comprise
from about 20 to 60% volume, preferably 30%, of a polar organic solvent. A common
hybridization solution employs about 30-50% v/v formamide, about 0.15 to 1M sodium 
chloride, about 0.05 to 0.1 M buffers, such as sodium citrate, Tris-HCl, PIPES or HEPES (pH 
range about 6-9), about 0.05 to 0.2% detergent, such as sodium dodecylsulfate, or between 
0.5-20 mM EDTA, FICOLL (Pharmacia Inc.) (about 300-500 kilo Daltons), 
polyvinylpyrrolidone (about 250-500 kdal), and serum albumin. Also included in the typical 
hybridization solution will be unlabeled carrier nucleic acids from about 0.1 to 5 mg/mL, 
fragmented nucleic DNA, e.g., calf thymus or salmon sperm DNA, or yeast RNA, and 
optionally from about 0.5 to 2% wt./vol. glycine. Other additives may also be included, such 
as volume exclusion agents that include a variety of polar water-soluble or swellable agents, 
such as polyethylene glycol, anionic polymers such as polyacrylate or polymethylacrylate, 
and anionic saccharidic polymers, such as dextran sulfate. 

[0141] Nucleic acid hybridization is adaptable to a variety of assay formats. One of
the most suitable is the sandwich assay format. The sandwich assay is particularly adaptable
to hybridization under non-denaturing conditions. A primary component of a sandwich-type 
assay is a solid support. The solid support has adsorbed to it or covalently coupled to it 
immobilized nucleic acid probe that is unlabeled and complementary to one portion of the 
sequence. 

[0142] Availability of the instant nucleotide and deduced amino acid sequences
facilitates immunological screening of DNA expression libraries. Synthetic peptides
representing portions of the instant amino acid sequences may be synthesized. These
peptides can be used to immunize animals to produce polyclonal or monoclonal antibodies
with specificity for peptides or proteins comprising the amino acid sequences. These
antibodies can then be used to screen DNA expression libraries to isolate full-length DNA
clones of interest (Lerner, 1984; Maniatis).

[0143] It is also contemplated in accordance with the present invention that REF8 can
be used to produce gene products having enhanced or altered activity. Various methods are
known for mutating a native gene sequence to produce a gene product with altered or
enhanced activity including but not limited to error prone PCR (Melnikov, et al., 1999); site
directed mutagenesis (Coombs, et al., 1998), and "gene shuffling" (US 5,605,793; US
5,811,238; US 5,830,721; and US 5,837,458, incorporated herein by reference).
[0144] The method of gene shuffling is particularly attractive due to its facile
implementation, and high rate of mutagenesis and ease of screening. The process of gene
shuffling involves the restriction endonuclease cleavage of a gene of interest into fragments 
of specific size in the presence of additional populations of DNA regions of both similarity to 
and difference from the gene of interest. This pool of fragments will then be denatured and 
reannealed to create a mutated gene. The mutated gene is then screened for altered activity. 
[0145] The instant ref8 sequences can be further mutated and screened for altered or
enhanced activity by this method. The sequences should be double stranded and can be of
various lengths ranging from 50 bp to 10 kb. The sequences can be randomly digested into
fragments ranging from about 10 bp to 1000 bp, using restriction endonucleases well known
in the art (Maniatis supra). In addition to the instant ref8 sequences, populations of fragments
that are hybridizable to all or portions of the ref8 sequence can be added. Similarly, a
population of fragments that are not hybridizable to the instant ref8 sequence can also be
added. Typically these additional fragment populations are added in about a 10 to 20 fold 
excess by weight as compared to the total nucleic acid. Generally if this process is followed 
the number of different specific nucleic acid fragments in the mixture will be about 100 to 
about 1000. The mixed population of random nucleic acid fragments is denatured to form
single-stranded nucleic acid fragments and then reannealed. Only those single-stranded 
nucleic acid fragments having regions of homology with other single-stranded nucleic acid 
fragments will reanneal. The random nucleic acid fragments may be denatured by heating. 
One skilled in the art could determine the conditions necessary to completely denature the 
double stranded nucleic acid. Preferably the temperature is from 80°C to 100°C. The nucleic 
acid fragments may be reannealed by cooling. Preferably the temperature is from 20°C to 
75°C. Renaturation can be accelerated by the addition of polyethylene glycol ("PEG") or 
salt. A suitable salt concentration may range from 0 mM to 200 mM. The annealed nucleic
acid fragments are next incubated in the presence of a nucleic acid polymerase and dNTPs
(i.e. dATP, dCTP, dGTP and dTTP). The nucleic acid polymerase may be the Klenow 
fragment, the Taq polymerase or any other DNA polymerase known in the art. The 
polymerase may be added to the random nucleic acid fragments prior to annealing, 
simultaneously with annealing or after annealing. The cycle of denaturation, renaturation and 
incubation in the presence of polymerase is repeated for a desired number of times. 
Preferably the cycle is repeated from 2 to 50 times, more preferably the sequence is repeated 
from 10 to 40 times. The resulting nucleic acid is a larger double-stranded polynucleotide of 
from about 50 bp to about 100 kb and may be screened for expression and altered activity by
standard cloning and expression protocols (Maniatis supra).

Methods of Use: Altering Phenylpropanoid Metabolism

[0146] The vectors used to transform plant cells comprise an REF8 nucleic acid or
portion thereof which is capable of hybridizing with the endogenous REF8 gene of
Arabidopsis or other species. Such nucleic acids include the sense or antisense strands of the 
ref8 gene or REF8 gene encoding all or part of a protein. In either case, the REF8 nucleic 
acid or its transcript is capable of hybridizing with an endogenous REF8 gene as defined 
herein or its transcript. The conditions under which such hybridization occurs include the 
physiological or equivalent conditions found within plant cells including that found in the 
nucleus and cytoplasm as well as standard in vitro conditions normally used by the skilled 
artisan to determine sequence homology as between two nucleic acids. Such in vitro 
conditions range from moderate (about 5 x SSC at 52°C) to high (about 0.1 x SSC at 65°C) 
stringency conditions. 
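
As a rough illustration of how the two in vitro conditions above differ, the following sketch estimates how far each incubation temperature lies below the melting temperature (Tm) of a hybrid. It relies on a commonly used empirical approximation for long DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(%formamide) - 500/L. The formula, the assumed sodium content of 1 x SSC (0.195 M), the function names, and the example probe (50% GC, 500 bp) are all assumptions that do not appear in this specification, and the approximation is least reliable at the extremes of salt concentration.

```python
import math

# Assumption: 1 x SSC is 0.15 M NaCl plus 0.015 M trisodium citrate,
# i.e. about 0.195 M sodium ions.
NA_MOLAR_PER_1X_SSC = 0.195


def approx_tm_celsius(ssc_fold, gc_percent, length_bp, formamide_percent=0.0):
    """Empirical Tm estimate for a long DNA:DNA hybrid (an approximation only)."""
    sodium = ssc_fold * NA_MOLAR_PER_1X_SSC
    return (81.5 + 16.6 * math.log10(sodium) + 0.41 * gc_percent
            - 0.61 * formamide_percent - 500.0 / length_bp)


def stringency_margin(ssc_fold, temp_c, gc_percent=50.0, length_bp=500):
    """Degrees C between the estimated Tm and the incubation temperature.

    A smaller margin means less tolerance for mismatches (higher stringency).
    """
    return approx_tm_celsius(ssc_fold, gc_percent, length_bp) - temp_c


moderate = stringency_margin(5.0, 52.0)   # about 5 x SSC at 52 C
high = stringency_margin(0.1, 65.0)       # about 0.1 x SSC at 65 C
print(f"moderate: {moderate:.1f} C below Tm; high: {high:.1f} C below Tm")
```

Under these assumptions the high stringency condition sits much closer to the estimated Tm than the moderate condition, which is why fewer mismatched hybrids would be expected to survive it.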

[0147] The REF8 gene is used to construct sense or antisense vectors for transforming
plant cells. The construction of such vectors is facilitated by the use of a binary vector
which is capable of manipulation and selection in both a plant and a convenient cloning host 
such as a prokaryote. Thus, such a binary vector can include a kanamycin or herbicide 
resistance gene for selection in plant cells and an actinomycin resistance gene for selection in 
a bacterial host. Such vectors, of course, also contain an origin of replication appropriate for 
the prokaryotic host used, and preferably at least one unique restriction site or a polylinker 
containing unique restriction sites to facilitate vector construction. 

[0148] In one embodiment, a constitutive promoter is used to drive expression of the
REF8 nucleic acid within at least a portion of the reproductive tissues in the recipient plant.
A particularly preferred promoter is the cauliflower mosaic virus 35S transcript promoter
(Guilley et al. (1982), Odell et al. (1985), and Saunders et al. (1987)). However, other
constitutive promoters can be used, such as the α-1 and β-1 tubulin promoters (Silflow et al.
(1987)) and the histone promoters (Chaubet et al. (1987)). Other promoters which can be 
used to provide tissue and temporal specificity to the expression of the REF8 nucleic acid 
include xylem-specific promoters. 

[0149] In a further embodiment of the invention, the vector used to transform the
Arabidopsis cell to produce an Arabidopsis having altered secondary metabolism is
constructed to target the insertion of the REF8 nucleic acid into an endogenous promoter 
within a plant cell. One type of vector which can be used to target the integration of an REF8 
nucleic acid to an endogenous promoter comprises a positive-negative selection vector 
analogous to that set forth by Monsour et al. (1988), which describes the targeting of 
exogenous DNA to a predetermined endogenous locus in mammalian ES cells. Similar 
constructs utilizing positive and negative selection markers functional in plant cells can be 
readily designed based upon the identification of the endogenous plant promoter and the 
sequence surrounding it (Kempin et al. (1997)). When such an approach is used, it is 
preferred that a replacement-type vector be used to minimize the likelihood of reversion
to the wild-type phenotype. 

[0150] The vectors of the invention are designed such that the promoter sequence
contained in the vector or the promoter sequence targeted in the plant cell genome is
operably linked to the nucleic acid encoding the REF8 gene. When the positive strand of the
ref8 gene is used to express all or part of the ref8 protein, the term "operably linked" means
that the promoter sequence is positioned relative to the coding sequence of the nucleic acid 
such that RNA polymerase is capable of initiating transcription of the ref8 nucleic acid from 
the promoter sequence. In such embodiments it is also preferred to provide appropriate 
ribosome binding sites, transcription initiation and termination sequences, translation 
initiation and termination sequences and polyadenylation sequences to produce a functional 
RNA transcript which can be translated into C3H polypeptide. When an antisense orientation 
of the REF8 nucleic acid is used, all that is required is that the promoter be operably linked to 
transcribe the REF8 antisense strand. Thus, in such embodiments, only transcription start and 
termination sequences are needed to provide an RNA transcript capable of hybridizing with 
the mRNA or other RNA transcript from the endogenous REF8 gene. In addition to 
promoters, other expression regulation sequences, such as enhancers, can be added to the 
vector to facilitate the expression of REF8 nucleic acid in vivo.

[0151] Once a vector is constructed, the transformation of plants can be carried out in
accordance with the invention by essentially any of the various transformation methods
known to those skilled in the art of plant molecular biology. Such methods are generally 
described in Wu and Grossman (1987). As used herein, the term "transformation" means the 
alteration of the genotype of a plant cell by the introduction of a nucleic acid sequence. 
Particular methods for transformation of plant cells include the direct microinjection of the 
nucleic acid into a plant cell by use of micropipettes. Alternatively, the nucleic acid can be 
transferred into a plant cell by using polyethylene glycol (Paszkowski et al. (1984)). Other 
transformation methods include electroporation of protoplasts (Fromm et al. (1985)); infection
with a plant specific virus, e.g., cauliflower mosaic virus (Hohn et al. (1982)) or use of 
transformation sequences from plant specific bacteria such as Agrobacterium tumefaciens, 
e.g., a Ti plasmid transmitted to a plant cell upon infection by Agrobacterium tumefaciens 
(Horsch et al. (1984); Fraley et al. (1983)). Alternatively, plant cells can be transformed by 
introduction of nucleic acid contained within the matrix or on the surface of small beads or 
particles by way of high velocity ballistic penetration of the plant cell (Klein et al. (1987)). 
The nucleic acid introduced with ballistics may be a chimeric oligonucleotide designed to 
target a small number of mutated bases to a selected segment of the endogenous REF8 gene 
(Beetham et al. (1999)). A small number of mutated bases can also be introduced into a 
selected segment of the endogenous REF8 gene using homologous recombination (Kempin et 
al. (1997)). 

[0152] After the vector is introduced into a plant cell, selection for successful
transformation is typically carried out prior to regeneration of a plant. Such selection for
transformation is not necessary, but facilitates the selection of regenerated plants having the 
desired phenotype by reducing wild-type background. Such selection is conveniently based 
upon the antibiotic resistance and/or herbicide resistance genes which may be incorporated 
into the transformation vector. 

[0153] Practically all plants can be regenerated from cultured cells or tissues. As
used herein, the term "regeneration" refers to growing a whole plant from a plant cell, a
group of plant cells or a plant part. The methods for plant regeneration are well known to 
those skilled in the art. For example, regeneration from cultured protoplasts is described by 
Evans et al. (1983); and H. Binding (1985). When transformation is of an organ part, 
regeneration can be from the plant callus, explants, organs or parts. Such methods for 
regeneration are also known to those skilled in the art. See, e.g., Wu and Grossman (1987); 
Weissbach and Weissbach (1986); and Klee et al. (1987).

[0154] Once plants have been regenerated, one or more plants are selected based
upon a change in phenylpropanoid metabolism. Such selection can be by TLC, HPLC, GC or
other means known in the art. 

[0155] Either antisense or co-suppression mechanisms using REF8 nucleic acids can
result in altered phenylpropanoid metabolism in many species. In addition, plants having
such altered metabolism can be used as model systems for further study of the 
phenylpropanoid pathway in plants. 

[0156] Alternatively, phenylpropanoid metabolism may be deterred by virus-induced
gene silencing (VIGS) using techniques known in the art (Baulcombe, 1999). Use of VIGS
suppresses gene expression in plants in a sequence-specific manner by infection with virus
vectors carrying fragments of host REF8 genes. As another alternative, phenylpropanoid
metabolism may be modified through the identification of mutants in which the endogenous
REF8 gene has been inactivated through processes including but not limited to T-DNA tagging
(Winkler, et al., 1998) or transposon mutagenesis (Hanley, et al., 2000; Enoki, et al., 1999)
using techniques known in the art. Mutants of this type would be expected to have decreased 
endogenous C3H polypeptide activity, and desirable characteristics like those described for 
the ref8 mutant. 



GENERAL METHODS 
[0157] Standard recombinant DNA and molecular cloning techniques used in the
Examples are well known in the art and are described by Maniatis and by Silhavy, et al.
(1984) and by Ausubel, et al. (1987). 

[0158] Materials and methods suitable for the maintenance and growth of bacterial
cultures are well known in the art. Techniques suitable for use in the following examples
may be found as set out in Manual of Methods for General Bacteriology (Gerhardt, et al., 
eds., 1994) or by Brock (1989). All reagents, restriction enzymes and materials used for the 
growth and maintenance of bacterial cells were obtained from Aldrich Chemicals 
(Milwaukee, WI), DIFCO Laboratories (Detroit, MI), GIBCO/BRL (Gaithersburg, MD), or 
Sigma Chemical Company (St. Louis, MO) unless otherwise specified. 
[0159] Manipulations of genetic sequences were accomplished using the suite of
programs available from the Genetics Computer Group Inc. (Wisconsin Package Version 9.0,
Genetics Computer Group (GCG), Madison, WI). Where the GCG program "Pileup" was 
used the gap creation default value of 12, and the gap extension default value of 4 were used. 
Where the GCG "Gap" or "Bestfit" programs were used the default gap creation penalty of
50 and the default gap extension penalty of 3 were used. In any case where GCG program
parameters were not prompted for, in these or any other GCG program, default values were 
used. 

Plant material 

[0160] Arabidopsis thaliana L. Heynh. ecotype Columbia plants were cultivated at a light
intensity of 100 µE m⁻² sec⁻¹ at 23°C under a photoperiod of 16 h light/8 h dark in Redi-Earth
potting mix (Scotts-Sierra Horticultural Products; Marysville, OH).

Sinapate ester analysis

[0161] For analysis of sinapate esters, tissue was extracted in 50% methanol
containing 1.5% (v/v) acetic acid and analyzed by reverse phase HPLC. Leaf extracts were
separated on a Microsorb-MV C18 column (Rainin Instruments, Woburn, MA) using a
gradient from 1.5% phosphoric acid to 35% acetonitrile in 1.5% phosphoric acid at a flow
rate of 1 mL min⁻¹. Seed extracts were separated on a Puresil C18 column (Waters, Milford,
MA) using a gradient from 1.5% acetic acid, 0.05% SDS to 30% acetonitrile in 1.5% acetic
acid, 0.05% SDS at a flow rate of 1 mL min⁻¹. To identify the phenolic component of esters
accumulated in leaf and seed extracts, samples were hydrolyzed in 1 M NaOH for 2 hr at
room temperature, acidified, and the liberated hydroxycinnamic acids were extracted into
ethyl acetate and dried in vacuo. Samples were redissolved in 50% methanol and separated
on a Microsorb-MV C18 column using a gradient from 5% acetic acid to 25% acetonitrile in
20% acetic acid at a flow rate of 1 mL min⁻¹.

Cell wall analysis

[0162] For cell wall preparation, Arabidopsis rachis (stem) tissue was ground to a fine
powder and extracted with neutral phosphate buffer, 80% ethanol and acetone (Meyer et al.,
1998). Cell wall esterified phenolics were released by saponification with 1 M NaOH for 16 h
at room temperature, and hydrolysis products were extracted in ethyl acetate and separated by
reverse phase HPLC using detection at 320 or 275 nm. To measure lignin content, cell wall
samples saponified as described above were analyzed using the TGA method (Campbell and
Ellis, 1992), or the microscale Klason method (Kaar et al., 1991). Lignin monomer
composition of saponified cell walls was determined by nitrobenzene oxidation (Meyer et al.,
1998), the DFRC method (Lu and Ralph, 1997) modified as previously described (Franke et
al., 2000), and by Py-GC-MS (Bocchini et al., 1997). NBO reactions were first extracted
with dichloromethane to remove nitrobenzene, after which reaction products were extracted
into diethyl ether and analyzed by HPLC. DFRC products were quantified by comparison to
authentic standards using GC-MS and selective ion monitoring.

[0163] Pyrolysis of the stem tissue was performed using a Shimadzu PYR-4A pyrolyzer
and analyzed by GC-MS (QP-5050A, Shimadzu, Columbia, MD, USA). Pyrolysis was 
performed at 500°C with the column inlet temperature maintained at 300°C. 
Chromatographic separation was performed using a 30 m Rtx-5ms column (0.25 mm i.d.; 
0.25 µm film thickness). The column flow was 1 mL min⁻¹ with a 1:100 split ratio. The oven
temperature program was 40°C for 10 min, 5°C min⁻¹ ramp to 280°C and 280°C for 5 min
(total run time of 63 minutes). The GC-MS interface temperature was held at 280°C. The 
mass spectrometric analysis of the column eluent began at 4.00 minutes and concluded at 50 
minutes. All lignin degradation products eluted prior to 45 minutes. The mass analyzer was 
scanned from 50 to 410 Da at a scan speed of 2000 Da sec⁻¹. Approximately 0.8 mg of stem
tissue and 1.6 µg of the internal standard, 1,3,5-tri-t-butylbenzene, were analyzed.
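
The stated total run time can be checked directly from the oven temperature program. The following is a minimal sketch for a single-ramp program; the function name gc_run_time_min and its parameter names are hypothetical and are introduced only for this illustration.

```python
def gc_run_time_min(start_c, initial_hold_min, ramp_c_per_min, final_c, final_hold_min):
    """Total run time of a single-ramp GC oven program, in minutes."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values taken from the oven program described above:
# 40 C for 10 min, 5 C per min to 280 C, then 280 C for 5 min.
total = gc_run_time_min(40, 10, 5, 280, 5)
print(total)  # 10 + 48 + 5 = 63.0
```

The 48 minute ramp plus the two holds gives 63 minutes, and the mass spectrometric acquisition window (4.00 to 50 minutes) falls within that run.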
[0164] To determine the digestibility of wild-type and ref8 cell walls, tissue samples
were treated with Driselase (Sigma, St. Louis MO), a crude mixture of endo- and exo- 
glucanases. Solvent-extracted cell walls (100 mg) were suspended in 1% Driselase in 50 mM 
pyridinium acetate buffer, pH 4.7, and incubated for 3 days at 37°C. The undigested cell wall 
residue was precipitated by centrifugation (1000 x g, 5 min) and the supernatant was
discarded. The pellet was washed twice with 50 mM pyridinium acetate buffer, pH 4.7 and 
the Driselase digestion was repeated for an additional 2 days. The undigested residue was 
collected by centrifugation, washed with 50 mM pyridinium acetate buffer, water, 3% (w/v) 
SDS in 1% (w/v) ammonium bicarbonate, water, and acetone and was dried overnight at 
room temperature before weighing. 
Histochemistry

[0165] Six week old rachis internodes were fixed, dehydrated, and embedded in
Spurr's resin. One µm cross sections were stained with Toluidine blue O and observed under
bright field microscopy. 
O-methyltransferase enzyme assays 

[0166] Protein extracts were prepared and COMT and CCoAOMT activities were
measured as described previously (Inoue et al., 1998). Total protein content was measured
using the Pierce BCA assay using bovine serum albumin as a standard. 

Ethylene measurements 

[0167] Rosettes of 4 week old plants were harvested, weighed, and incubated under
ambient laboratory lighting in a sealed 5 mL scintillation vial for 90 min. Ethylene content of
the gas phase was measured as described previously (Jones and Woodson, 1999).

Radiotracer feeding experiments

[0168] U-¹⁴C-L-phenylalanine was administered to individual illuminated leaves of
wild-type and mutant plants for 3 h before extraction in methanol at 60°C in the presence of
unlabeled hydroxycinnamic acids, as described previously (Chapple et al., 1992). Extracts
were analyzed directly or after saponification for 30 min in 1 M NaOH followed by
acidification and extraction of the radiolabeled products into diethyl ether. Extract 
components were resolved by a two-dimensional silica gel TLC system that separates all of 
the natural hydroxycinnamic acids (solvent 1, petroleum ether/ethyl acetate/methanol/acetic 
acid 10:10:1:0.2; solvent 2, toluene/acetic acid/water 2:1:sat). Incorporation of label into
pathway intermediates was determined using a Packard Instant Imager. To measure the 
incorporation of label from U-¹⁴C-L-phenylalanine specifically into caffeic acid, an identical
feeding experiment was conducted; however, prior to analysis, the extract was incubated in
methanolic HCl (80°C, 1 h) to convert caffeic acid and its ester conjugates to caffeic acid
methylester. Methylcaffeate was purified by semi-preparative silica gel TLC using the two 
solvents described above. Following each round of TLC, the band of methylcaffeate was 
identified under UV light, scraped from the TLC plate, eluted in methanol, and after the 
second round of TLC, analyzed by reversed phase HPLC (solvent A, 5% acetic acid in water; 
solvent B, 20% acetic acid, 25% acetonitrile in water; 5 to 55% B in 20 min, 55 to 100% B in 
10 min, 100% B for 5 min; flow rate 1 mL min⁻¹) using diode array UV detection. One mL
fractions were collected and analyzed for radioactivity by liquid scintillation counting. 
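
The reversed phase gradient described above can be expressed as a piecewise-linear program. The sketch below interpolates the percentage of solvent B, and the resulting acetonitrile content, at any time in the 35 minute program. The breakpoints are taken from the text; the names GRADIENT, percent_b, and percent_acetonitrile are hypothetical, and the assumption of ideal linear mixing is a simplification.

```python
# (time in min, percent solvent B) breakpoints taken from the gradient above.
GRADIENT = [(0.0, 5.0), (20.0, 55.0), (30.0, 100.0), (35.0, 100.0)]
ACETONITRILE_IN_B = 25.0  # percent acetonitrile in solvent B; solvent A has none


def percent_b(t_min, gradient=GRADIENT):
    """Linearly interpolate percent solvent B at time t_min."""
    if not gradient[0][0] <= t_min <= gradient[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_acetonitrile(t_min):
    """Acetonitrile content of the mixed eluent at time t_min (ideal mixing assumed)."""
    return ACETONITRILE_IN_B * percent_b(t_min) / 100.0


for t in (0, 10, 25, 35):
    print(t, percent_b(t), percent_acetonitrile(t))
```

With one mL fractions collected at 1 mL min⁻¹, a fraction number maps directly onto a time in minutes, so a function of this kind could be used to estimate the eluent composition for any collected fraction.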
Map-based cloning 

[0169] The ref8 mutant (Columbia background) was used as the male parent in a
cross to the Landsberg erecta ecotype. F1 individuals were allowed to self-pollinate, and F2
plants were screened for the ref8 phenotype. Because ref8 plants are small in stature and are
female sterile, seeds from phenotypically wild-type plants (REF8/ref8 and REF8/REF8
individuals) were collected and the F3 progeny were scored for segregation of the ref8
phenotype. DNA was extracted (Doyle and Doyle, 1990) from homozygous wild-type lines
for ARMS mapping (Schaffner, 1996) to determine an initial map position for the REF8
gene. Subsequently, DNA was extracted from additional F2 plants and F3 families for use in 
PCR-based genotyping experiments. Individuals carrying recombinant chromosomes in the 
region of the REF8 locus were used to determine a mapping interval for the gene, and were 
analyzed further. 
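
The segregation logic behind the F2 and F3 scoring can be sketched as follows, under the assumption (implied by, but not stated in, the description of REF8/ref8 plants as phenotypically wild type) that ref8 behaves as a single recessive nuclear allele with ordinary Mendelian transmission. The function name self_cross and the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def self_cross(genotype):
    """Offspring genotype frequencies from self-pollination at a single locus."""
    offspring = {}
    for pair in product(genotype, repeat=2):  # one allele from each gamete
        key = tuple(sorted(pair))
        offspring[key] = offspring.get(key, Fraction(0)) + Fraction(1, 4)
    return offspring


F1 = ("REF8", "ref8")
f2 = self_cross(F1)

# Assumption: ref8 is recessive, so any plant with one REF8 allele looks wild type.
wild_type = {g: f for g, f in f2.items() if "REF8" in g}
het_share = wild_type[("REF8", "ref8")] / sum(wild_type.values())

# F3 families from heterozygous F2 plants segregate the mutant phenotype.
mutant_in_f3 = self_cross(("REF8", "ref8"))[("ref8", "ref8")]

print(f2[("ref8", "ref8")], het_share, mutant_in_f3)  # 1/4 2/3 1/4
```

Under these assumptions, about two thirds of phenotypically wild-type F2 plants would be expected to yield F3 families that segregate the ref8 phenotype, and the remaining third would breed true.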
RNA gel blot analysis

[0170] For the isolation of RNA, plant tissues were harvested, frozen in liquid
nitrogen, and stored at -70°C until ready for extraction. Total RNA was isolated as
previously described (Goldsbrough and Cullis, 1981). Samples were electrophoretically
separated, transferred to Hybond N+ membranes (Amersham), hybridized at 65°C with a
DNA probe (DECAprime II system, Ambion) using a CYP98A3 EST ordered from the
Arabidopsis Biological Resource Center (209A1T7; Genbank accession number N37715),
washed, and exposed to film. 
Isolation of the re18 cDNA 

[0171] The ref8 cDNA was isolated by reverse transcriptase-PCR from total RNA
using the Promega Access RT-PCR system (Madison, WI) with primer 1
(5'-gcaaggatccatgtcgtggtttctaatagcg-3') and primer 2 (5'-tcaggaattcatttacatatcgtaaggcacg-3').
These primers correspond to the 5' and 3' ends of the open reading frame and introduce a
BamHI site upstream of the start codon and an EcoRI site downstream of the stop codon,
respectively, for use in subsequent yeast expression studies. Two independent reaction
products were subcloned and sequenced to identify the mutation in the ref8 allele.
Heterologous expression 

[0172] To generate the pBOV E. coli P450 expression plasmid, pCWOri+ was first
digested with HindIII and NdeI and purified by gel electrophoresis. The overlapping
complementary primers 3 (5'-tatggctctgttattagcagtttttata-3'), 4
(5'-caggcctataaaaactgctaataacagagcca-3'), 5 (5'-ggcctgcatgccatcatcatcatcatcattag-3'), and 6
(5'-agctctaatgatgatgatgatgatggcatg-3') were then phosphorylated using polynucleotide kinase and
ligated into the pCWOri+ vector backbone. The resulting cloning site in the plasmid contains
the first eight codons of the bovine CYP17 gene followed by a StuI site for the blunt-ended
cloning of PCR-amplified P450 cDNAs, a downstream SphI site, a sequence coding for a 6x-
His tag, and a stop codon.
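The layout of the cloning site described above can be checked directly from the primer sequences. The sketch below assumes that the top strand of the assembled linker is simply primer 3 followed by primer 5 (an inference from the order of elements in the text, not a statement made in it), and it uses a deliberately partial codon table covering only the codons that occur in the leader.

```python
# Primer sequences 3 and 5 as given in the text (5' -> 3').
PRIMER_3 = "tatggctctgttattagcagtttttata"
PRIMER_5 = "ggcctgcatgccatcatcatcatcatcattag"

# Only the codons needed for the leader; this is not a full genetic code table.
CODONS = {"atg": "M", "gct": "A", "ctg": "L", "tta": "L",
          "gca": "A", "gtt": "V", "ttt": "F"}

# Assumption: the top strand of the linker is primer 3 joined to primer 5.
top_strand = PRIMER_3 + PRIMER_5

# Translate the first eight codons starting at the first ATG.
start = top_strand.find("atg")
leader = "".join(CODONS[top_strand[i:i + 3]] for i in range(start, start + 24, 3))

stui_index = top_strand.find("aggcct")   # StuI recognition sequence
sphi_index = top_strand.find("gcatgc")   # SphI recognition sequence
ends_with_his_stop = top_strand.endswith("cat" * 6 + "tag")
```

On this reading, the eight-codon leader is followed by the StuI site, then the SphI site, then six histidine codons and a stop codon, which matches the order given in the paragraph.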

[0173] Using the CYP98A3 EST as a template for PCR, the pBOV-REF8 plasmid was
generated by first using primer 7 (5'-gacaatcgccgccgtcgtatcctac-3') and primer 8
(5'-catatcgtaaggcacgcgtttgtac-3') to produce a truncated version of the open reading frame that
lacked the first nine codons of the protein's N-terminal signal peptide. This PCR product was
subcloned into StuI-digested pBOV, the orientation of the insert was determined using
diagnostic restriction digests, and the fidelity of the PCR process was verified by sequencing.
To generate the pBOV-ref8 plasmid, the CYP98A3 EST was used in two separate reactions
using primer 7 with primer 9 (5'-cggtgcacaacttgatatcaatttgg-3') and primer 6 with primer 10
(5'-ccaaattgatatcaagttgtgcaccg-3') to introduce the ref8 mutation (underlined in primers 9 and
10) into each of two overlapping fragments of the cDNA. The PCR products were purified
by agarose gel electrophoresis, combined in a single PCR reaction, and amplified using only
primers 7 and 6. The resulting full length product was subcloned into pBS KS-, sequenced,
and a StyI/SphI fragment containing the ref8 mutation was used to replace the corresponding
portion of pBOV-REF8 to yield pBOV-ref8.
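The two-fragment strategy described above relies on the mutagenic primers annealing to each other so that the fragments can be joined in the second PCR. A short check, written here as an illustrative Python sketch (the helper name is an assumption, the sequences are those printed in the text), confirms that primers 9 and 10 are exact reverse complements.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


PRIMER_9 = "cggtgcacaacttgatatcaatttgg"
PRIMER_10 = "ccaaattgatatcaagttgtgcaccg"

# True when the two mutagenic primers overlap along their full length.
primers_fully_overlap = reverse_complement(PRIMER_9) == PRIMER_10
```

Because the overlap is complete, both first-round fragments carry the same substitution at their shared end.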

[0174] The construction of the Saccharomyces cerevisiae strain WAT11, a derivative
of the W303-B strain (MAT a; ade2-1; his3-11,-15; leu2-3,-112; ura3-1; canR; cyr+)
expressing the ATR1 Arabidopsis NADPH-P450 reductase, was previously described (Truan
et al., 1993; Pompon et al., 1996). For the construction of the YeDP60-REF8 expression
construct, the CYP98A3 EST was used as the template for PCR as described above with
primer 1 and primer 2. The resulting 1.5 kb PCR product was subcloned, sequenced, and
ligated into BamHI/EcoRI-digested pYeDP60 (Urban et al., 1990) to yield the plasmid
YeDP60-REF8. To generate the plasmid YeDP60-ref8, PCR using the CYP98A3 EST was
conducted in two separate reactions using primer 1 with primer 9 and primer 2 with primer 10
to introduce the ref8 mutation into each of two overlapping fragments of the cDNA. The
PCR products were purified by agarose gel electrophoresis, combined in a single PCR
reaction, and amplified using only primers 1 and 2. The resulting full length product was
subcloned into pBS KS-, sequenced, and subcloned into YeDP60 as described above.
Measurement of enzymatic activity in vivo 

[0175] WAT11 cells were transformed with pYeDP60, pYeDP60-REF8, and
pYeDP60-ref8 (Gietz et al., 1992), cultured, and then induced with galactose as described
previously (Urban et al., 1994). For in vivo measurements of enzyme activity, cells were
grown in media supplemented with 5 mM p-coumaric acid. At the end of the incubation
period the medium was extracted with ethyl acetate and analyzed by HPLC as described
above. SDS-PAGE analysis of heterologously expressed C3H and in vitro C3H assays were
performed as described previously for measurement of ferulate 5-hydroxylase activity
(Humphreys et al., 1999).
Measurement of enzymatic activity in vitro 

[0176] For C3H assays, an NADPH regenerating system consisting of 1 mM NADP+,
10 mM glucose-6-phosphate, and 1 unit glucose-6-phosphate dehydrogenase was
pre-incubated at 30°C for 5 min to permit the generation of NADPH in the presence of one
of the putative C3H substrates in a final volume of 400 µL of assay buffer. Assays were
initiated by the addition of 100 µL of microsomes and were allowed to incubate for 60 min at
30°C before being terminated by boiling. Assays were clarified by centrifugation for 20 min
at 13,000 x g, and were analyzed directly by HPLC on a Microsorb-MV C-18 column (Rainin,
Woburn MA) using a gradient of solvent A (1.0% acetic acid in water) and solvent B
(acetonitrile; 0 to 10% B in 5 min, 10 to 25% B in 25 min; flow rate 1 mL min⁻¹). C3H
reaction products were quantified using UV detection (caffeic acid, 322 nm; caffeyl aldehyde,
340 nm; caffeic acid methyl ester, 324 nm). Assays conducted using microsomes isolated
from yeast transformed with pYeDP60 served as negative controls. The apparent Km for
p-coumarate methyl ester was determined using triplicate assays analyzed by the Eadie-Hofstee
method (Cornish-Bowden, 1995).
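In the Eadie-Hofstee method, the Michaelis-Menten equation is rearranged to v = Vmax - Km (v/[S]), so that a plot of v against v/[S] is a straight line with slope -Km and intercept Vmax. The following is a minimal sketch of that fit using ordinary least squares; the function name and the synthetic rate values are assumptions for illustration and are not data from the assays described above.

```python
def eadie_hofstee(substrate, velocity):
    """Fit v = Vmax - Km * (v / [S]) by ordinary least squares.

    Returns (Km, Vmax) in the units of the inputs.
    """
    x = [v / s for s, v in zip(substrate, velocity)]
    y = list(velocity)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept


# Synthetic, noise-free rates generated from assumed constants (not measured data).
ASSUMED_KM, ASSUMED_VMAX = 10.0, 2.0
conc = [2.0, 5.0, 10.0, 20.0, 50.0]
rates = [ASSUMED_VMAX * s / (ASSUMED_KM + s) for s in conc]
km, vmax = eadie_hofstee(conc, rates)
```

With real triplicate data, each replicate would typically be entered as its own point, since v appears on both axes and averaging first changes the weighting of the fit.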
GC-MS 

[0177] Analysis of C3H reaction products was performed using a GC-MS system
(Shimadzu Corp., Kyoto, Japan) composed of an AOC-20 autosampler, a GC-17A gas
chromatograph, and a GCMS-QP5050A mass spectrometer. A standards solution was
prepared from methanolic stock solutions of the substrates and products. Aliquots of each
stock solution were taken in order to obtain a solution having approximately 1000 pmol µL⁻¹
of each standard. The standards solution and the assay samples were dried on a RC 10.10
centrifugal evaporator (Jouan, Winchester VA), resuspended in 100 and 50 µL of pyridine,
respectively, and derivatized using 10 and 5 µL N-methyl-N-trimethylsilyltrifluoroacetamide
(MSTFA), respectively, at 37°C for 30 minutes. A one microliter aliquot of each sample
was injected into the GC-MS with the split ratio set to 1:25. Helium was used as the carrier
gas with a flow rate of 1 mL min⁻¹. The injector temperature was maintained at 220°C. Gas
chromatography was performed using a 30 m DB-5MS column (0.25 mm I.D., 0.25 µm film
thickness) (J&W Scientific, Folsom CA). The column temperature was initially maintained at
125°C for 2 minutes and then ramped at 15°C/min to 250°C and held for 4 minutes. The
interface temperature was held at 280°C. After a 4.95 minute solvent cut time, detection was
performed using a single quadrupole mass filter set to scan from mass-to-charge 50 to 410 in
0.17 seconds.
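The column temperature program above fixes the length of each run. As a rough sketch, assuming a single linear ramp between the two holds and no additional equilibration time (the function name is illustrative), the run time and the portion left for detection after the solvent cut can be computed as follows.

```python
def oven_program_minutes(start_c, initial_hold_min, ramp_c_per_min,
                         final_c, final_hold_min):
    """Total duration of a hold / linear ramp / hold oven program."""
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values from the text: 125 C for 2 min, 15 C/min to 250 C, hold 4 min.
total_min = oven_program_minutes(125, 2, 15, 250, 4)

# Detection starts only after the 4.95 min solvent cut time.
SOLVENT_CUT_MIN = 4.95
acquisition_min = total_min - SOLVENT_CUT_MIN
```

The ramp takes a little over 8 minutes, giving a run of roughly 14.3 minutes, of which about 9.4 minutes fall after the solvent cut.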

EXAMPLES 

[0178] The present invention is further defined in the following Examples. It should
be understood that these Examples, while indicating preferred embodiments of the invention,
are given by way of illustration only. From the above discussion and these Examples, one
skilled in the art can ascertain the essential characteristics of this invention, and without
departing from the spirit and scope thereof, can make various changes and modifications of
the invention to adapt it to various usages and conditions.

EXAMPLE 1 

Evaluation of Phenylpropanoid Metabolism in ref8 Mutant
[0179] To identify mutants defective in sinapoylmalate biosynthesis, approximately
100,000 M2 seedlings (Columbia ecotype) were screened for individuals that exhibited
altered cotyledon and/or leaf fluorescence when exposed to UV light. This mutant screen
identified representatives of two known mutations that are affected in sinapate ester
biosynthesis, fah1 and sng1, as well as a number of additional mutant lines that define several
other loci. Five of these have been phenotypically characterized (Ruegger and Chapple, in
press), whereas the others have not, largely due to issues associated with vigor and fertility of
the M2 plants and their progeny. The ref8 mutant was one belonging to this latter group.
[0180] When observed under UV light, the ref8 mutant is strongly red fluorescent,
consistent with the absence of sinapoylmalate, and the accentuation of chlorophyll
fluorescence that accompanies the lack of this UV-absorbing secondary metabolite. Visible
phenotypes of the ref8 mutant. (a) UV phenotype of the ref8 mutant. Rosette leaves of three
week old plants were photographed under 365 nm UV light using a yellow barrier filter. The
blue-green color of the wild-type rosettes is due to the fluorescence of sinapoylmalate. The
red fluorescence of the ref8 plant is due to chlorophyll fluorescence that is revealed in the
absence of sinapate ester fluorescence. (b) The developmental phenotype of the ref8 mutant.
Wild-type and ref8 plants were grown for 6 weeks under a 16 h light / 8 h dark photoperiod at
22°C. When F2 seedlings from crosses of ref8 mutants to wild type were examined, the
mutant phenotype segregated as a recessive, nuclear, single gene mutation (405 REF8/-, 138
ref8/ref8; χ² = 0.04, P>0J).
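A recessive, single-gene mutation is expected to segregate 3:1 in the F2, and the fit of the observed counts to that ratio is conventionally tested with a chi-square statistic on one degree of freedom. The sketch below is one way to carry out that test on the counts given above; it applies no continuity correction, which is an assumption, since the text does not say how the reported statistic was computed. The uncorrected value for these counts comes out near 0.05, so the exact figure evidently depends on rounding or on the correction used.

```python
import math


def chi_square_3_to_1(dominant, recessive):
    """Chi-square goodness of fit to a 3:1 ratio, with its P value (1 d.f.)."""
    total = dominant + recessive
    expected = (0.75 * total, 0.25 * total)
    observed = (dominant, recessive)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p_value = chi_square_3_to_1(405, 138)
```

A large P value here means the counts give no reason to reject the 3:1 hypothesis.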

[0181] The profile of soluble secondary metabolites is altered in the ref8 mutant. The
ref8 mutant looks similar to the fah1-2 mutant when observed under UV light, suggesting that
the mutation leads to substantial reductions in leaf sinapoylmalate content. Since not all
phenolic compounds are fluorescent, it was not clear by this visual inspection whether ref8
leaves were devoid of phenylpropanoid esters, or whether they only accumulated
non-fluorescent pathway products. To distinguish between these two possibilities, we analyzed
extracts of three-week-old rosettes by HPLC (Fig. 1a). HPLC analysis of soluble secondary
metabolites produced by wild-type and ref8 plants. (Fig. 1a) Compounds found in wild-type
and ref8 leaves were extracted with methanol and analyzed by HPLC. The elution of
UV-absorbing compounds was monitored at 320 nm. (Fig. 1b) Hydroxycinnamic acids released
from their ester conjugates by saponification (1 M NaOH, 16 h, room temperature) of the
methanolic extract of (a). (Fig. 1c) The same analyses as (a) performed on wild-type and
ref8 seed extracts. Sinmal, sinapoylmalate; Singlc, sinapoylglucose; Sincho, sinapoylcholine.
[0182] This analysis confirmed that ref8 leaves lack sinapoylmalate, and indicated
that ref8 mutants instead accumulate an array of other, apparently non-fluorescent, soluble
secondary metabolites that absorb UV light at 320 nm. As a first step toward the
identification of these compounds, leaf extracts were saponified and the hydrolysates were
then characterized by HPLC. These analyses indicated that whereas the predominant
hydroxycinnamic acid in saponified extracts of wild-type plants was sinapic acid, extracts of
ref8 contained little sinapic acid and substantial quantities of the cis and trans isomers of
p-coumaric acid (Fig. 1b). This finding suggests that the UV-absorbent compounds present in
ref8 leaves are likely to include esters of p-coumaric acid.

[0183] In addition to the sinapoylmalate found in leaf tissue, Arabidopsis accumulates
sinapoylcholine and, to a lesser extent, sinapoylglucose in its seeds. To evaluate the impact of
the ref8 mutation on seed secondary metabolism, extracts were prepared from mutant and
wild-type seeds and were analyzed by HPLC (Fig. 1c). The most striking difference between
the two genotypes was the lower overall content of UV-absorbing metabolites in ref8
samples. Sinapoylglucose and sinapoylcholine were identified in wild-type seed extracts, and
although minor peaks co-chromatographing with these compounds were observed in ref8
extracts, a number of other minor novel peaks were also found in the mutant samples (Fig.
1c). When wild-type and ref8 seed extracts were saponified, the major hydroxycinnamic
acids recovered in the hydrolysates were identified by HPLC as sinapic and p-coumaric acids,
respectively, suggesting that novel p-coumaroyl esters are accumulated in ref8 seeds.
[0184] Because flavonoids constitute another important class of phenylpropanoid
metabolites, we scored ref8 plants for their ability to accumulate flavonoid derivatives
including anthocyanins and seed coat condensed tannins. Anthocyanins can readily be
observed as red-colored pigments accumulating in leaf tissue of the mutant, and in the
epidermis of the lower portions of the rachis. Similarly, the seeds of the ref8 mutant are
brown, indicating that the mutation does not disrupt the flavonoid pathway in the testa. Taken
together, these phenotypes indicate that the steps of the phenylpropanoid pathway from
phenylalanine to p-coumaroyl CoA and flavonoid biosynthesis remain unaffected in the ref8
mutant, and suggest that ref8 plants are defective in one of the steps required for the
conversion of p-coumarate to sinapate.
[0185] To investigate where phenylpropanoid metabolism is blocked in the ref8
mutant, we evaluated the fate of ¹⁴C-L-phenylalanine administered to wild-type and ref8
leaves via the transpiration stream. These experiments revealed that radiolabeled pools of
cinnamic, p-coumaric, ferulic and sinapic acids can be detected in the extracts of wild-type
plants. In extracts of the ref8 mutant, radiolabel was detected in cinnamic acid and
p-coumaric acid, but not in any subsequent metabolites. In neither wild type nor ref8 was
radiolabel found to be associated with caffeic acid or 5-hydroxyferulic acid, possibly
indicating that the pools of these intermediates are below the detectable limits of this method.
Since hydroxycinnamic acids are often found in esterified forms in plants, and
hydroxycinnamoyl CoA thioesters are thought to comprise an important group of molecules in plant
metabolism, samples of the previous extracts were saponified and again separated by
two-dimensional thin layer chromatography (TLC). These analyses provided qualitatively similar
data: the ref8 extract hydrolysates contained no radiolabeled ferulic acid, but wild-type levels
of labeled p-coumaric acid.

[0186] The inability of the ref8 mutant to convert ¹⁴C-phenylalanine into ferulic acid
is consistent with the hypothesis that the mutant is blocked in either the hydroxylation of
p-coumaric acid, the O-methylation of caffeic acid to ferulic acid, or the analogous reactions
occurring at the level of the corresponding CoA thioesters. To distinguish between these
possibilities, we compared the levels of caffeic acid/5-hydroxyferulic acid
O-methyltransferase (COMT) and caffeoyl CoA O-methyltransferase (CCoAOMT) in wild-type and
ref8 stem extracts. These experiments revealed that the two O-methyltransferase activities
were present at near wild-type levels in ref8 (Table 2). Although these data suggested that
O-methylation of caffeic acid and caffeoyl-CoA are not affected in the ref8 mutant, we wanted
to determine whether the supply of the co-substrate of the O-methyltransferase reaction,
S-adenosylmethionine (SAM), might be limiting phenylpropanoid metabolism in the ref8
mutant. As an indirect measure of SAM levels, we quantified the production of ethylene in
wild-type and ref8 plants (Table 2). These measurements indicated that ref8 rosettes are
competent to synthesize ethylene, and that a block in SAM biosynthesis is not likely to be the
cause of the perturbation in phenylpropanoid metabolism in the mutant.
[0187] To directly evaluate the ability of the mutant to hydroxylate p-coumaric acid
and/or p-coumaroyl-CoA, we performed a modified version of the previous radiotracer
feeding experiments, again administering ¹⁴C-L-phenylalanine to excised wild-type and ref8
leaves. In this experiment, the leaf extracts were treated with methanolic HCl to convert
hydroxycinnamic acid esters and thioesters as well as free hydroxycinnamic acids to their
corresponding methyl esters, while simultaneously preventing the destruction of alkali-labile
dihydroxy-substituted compounds such as caffeic acid. Following two successive rounds of
preparative TLC, we analyzed the semi-purified methylcaffeate by HPLC and liquid
scintillation counting. Radiotracer feeding experiment designed to measure incorporation of
label from ¹⁴C-L-phenylalanine specifically into caffeic acid, analyzed as its methylester. The
continuous line represents the UV-absorption of the HPLC column eluate. The histogram
represents radioactivity associated with each fraction (Ca, caffeic acid; pC, p-coumaric acid;
Me-Ca, methylcaffeic acid; Me-pC, methyl-p-coumaric acid). Using UV detection, we
readily identified the methylcaffeate derived from the internal standard of unlabeled caffeic
acid that had been added to the wild-type and ref8 samples at the time of extraction. In
contrast, whereas radioactivity co-chromatographing with methylcaffeate was readily
detected in the fractions collected from the wild-type samples, no radioactivity was detected
in these fractions when this procedure was repeated on the ref8 samples. These experiments,
as well as the phenotypic characterization of the ref8 mutant (Franke et al., submitted),
provided extremely strong evidence that the REF8 gene encodes a protein required for the
activity or expression of C3H.





                    ethylene production
                    (ppb mg f.w.⁻¹ min⁻¹ ± std. dev.)
wild type           11.3 ± 2.7
ref8                6.3 ± 1.3

Table 2. Ethylene production in wild-type and ref8 Arabidopsis. Rosettes of three-week-old
plants were closed in a glass vial and ethylene content in the headspace gas was determined
after three hours by gas chromatography. The data represent the means of three independent
measurements ± standard deviation.

EXAMPLE 2 

Evaluation of Cell-Wall Bound Polysaccharides from ref8 Mutant
[0188] Plant cell walls frequently contain hydroxycinnamic acids that are esterified to
cell wall polysaccharides. To evaluate whether the ref8 mutation has an impact on cell
wall-bound phenylpropanoids, wild-type and ref8 rachis cell wall preparations were subjected to
alkaline hydrolysis to release esterified phenolics. HPLC analysis of cell wall hydrolysates
with UV detection at 320 nm revealed that wild-type plants deposit ester-bound forms of
p-coumaric and ferulic acid in their cell walls (Fig. 4). These extracts also contained
substantial amounts of substances that co-chromatograph with vanillin and syringaldehyde,
presumably arising from guaiacyl and syringyl lignin subunits, respectively. Although the
cis-isomer of p-coumaric acid co-chromatographs with syringaldehyde in this system, similar
experiments conducted on fah1 cell walls, which lack syringyl lignin and thus do not generate
syringaldehyde, revealed that only small amounts of p-coumarate are actually ester-linked to
Arabidopsis cell walls. In contrast, analysis of ref8 extracts at 320 nm indicated that the cis-
and trans-isomers of p-coumaric acid dominated the chromatographic profile, and that only
small quantities of compounds co-chromatographing with ferulic acid isomers were present.
Since the ability to absorb long wave UV light is restricted to compounds in which the double
bond system of the phenolic ring extends into the propene side chain, the same samples were
re-analyzed using a wavelength of 275 nm to test for the presence of other classes of phenolic
compounds (Fig. 4). This detection method revealed the presence of a compound that
co-chromatographs with p-hydroxybenzaldehyde in the cell wall hydrolysates of the ref8 mutant.
This finding suggests that ref8 cell walls may contain novel esters of
p-hydroxybenzaldehyde. Alternatively, it may indicate that the lignin in the ref8 mutant
contains subunits derived from p-hydroxycinnamyl alcohol in addition to, or instead of,
subunits derived from coniferyl and/or sinapyl alcohols (see below). This novel lignin might
release p-hydroxybenzaldehyde under the conditions used in this experiment.
[0189] From the ref8 mutant phenotypes described above, it can be expected that
down-regulation of C3H polypeptide activity in other plants would lead to similar changes in
cell wall biochemical characteristics. Considering that the presence of cell wall esterified
phenolics has a dramatic impact on cell wall characteristics such as the digestibility of
plant-derived products used as animal feed, it can be anticipated that such changes in C3H
polypeptide activity will add value to plants used in agriculture and forestry. It is well known
in the art that down-regulation of C3H polypeptide activity could be achieved by many
different mechanisms, including, but not limited to, stable transformation with antisense
suppression constructs, stable transformation with sense suppression constructs, or virus
induced gene silencing.

EXAMPLE 3 

The ref8 Mutation Affects Lignin Biosynthesis Quantitatively
[0190] We have previously found that some Arabidopsis mutants defective in
sinapate ester metabolism also exhibit alterations in lignin biosynthesis. To examine the
impact of the ref8 mutation on lignin content, we analyzed cell wall material using
thioglycolic acid (TGA) derivatization (Campbell and Ellis, 1992), and the Klason method
(Kaar et al., 1991) (Table 3). These analyses indicated that the lignin content of ref8 stems
was reduced to 20 to 40% of wild-type levels, depending upon the method employed.
[0191] From the decrease in lignin content seen in the ref8 mutant, it can be expected
that down-regulation of C3H polypeptide activity in other plants would lead to similar
decreases in lignin deposition. Considering that lignin content is an important
determinant in the use of lignocellulosic plant materials for purposes such as pulp and paper
production, and influences the nutritional quality of forages fed to animals, it can be
anticipated that such changes in C3H polypeptide activity will add value to plants used in
agriculture and forestry. It is well known in the art that down-regulation of C3H polypeptide
activity could be achieved by many different mechanisms, including, but not limited to, stable
transformation with antisense suppression constructs, stable transformation with sense
suppression constructs, or virus induced gene silencing.

EXAMPLE 4 

The ref8 Mutation Affects Lignin Biosynthesis Qualitatively
[0192] To examine the type of lignin deposited in the ref8 mutant, rachis cell wall
preparations were analyzed by nitrobenzene oxidation (NBO), the DFRC (derivatization
followed by reductive cleavage) method (Lu and Ralph, 1997), and by pyrolysis GC-mass
spectrometry (Py-GC-MS) (Bocchini et al., 1997) (Fig. 5). NBO releases substituted
benzaldehyde and benzoic acid derivatives from lignin, and the relative content of these
monomers is indicative of the presence of p-hydroxyphenyl (p-hydroxybenzaldehyde and
p-hydroxybenzoic acid), guaiacyl (vanillin and vanillic acid), and syringyl (syringaldehyde and
syringic acid) units in the polymer. The expected products representing a typical angiosperm
guaiacyl-syringyl co-polymer were identified following HPLC analysis of the NBO products
from wild-type cell walls (Fig. 5). In contrast, NBO of ref8 cell walls yielded only trace
amounts of a compound that co-chromatographed with vanillin, and a substance with the
retention time of p-hydroxybenzaldehyde was instead the major NBO product. Lignin
analysis by the DFRC method generates acetylated hydroxycinnamyl alcohols from beta-O-4
etherified lignin subunits. When DFRC products from wild-type cell wall preparations were
analyzed by GC-MS using selective ion monitoring, coniferyl and sinapyl alcohol diacetates
were identified by their retention times and EIMS fragmentation patterns as the major lignin
degradation products. Trace amounts of p-coumaryl alcohol diacetate were also observed,
although the low abundance of the compound precluded its quantitation. In contrast,
p-coumaryl alcohol diacetate was the major DFRC product in ref8 samples. Both coniferyl and
sinapyl alcohol diacetates were observed, but were below quantifiable limits. Consistent with
the TGA and Klason data (Table 3), comparison of DFRC yields on a per g dry weight basis
indicated that ref8 cell walls contain less total lignin. It should be noted, however, that these
values are not directly comparable since DFRC yields from ref8 lignin were substantially
lower than wild type when compared based upon Klason lignin content (Table 3).
[0193] Analysis of the wild-type and mutant lignin by Py-GC-MS confirmed the
results of the previous two methods. An array of pyrolysis degradation products expected for
a mixed guaiacyl/syringyl copolymer was identified in wild-type samples, which also
contained small amounts of p-hydroxyphenyl lignin-derived products. In contrast, pyrolysis
yielded only p-hydroxy products from ref8 samples, and guaiacyl and syringyl substituted
products were below detectable limits. These data indicate that, in addition to interfering
with the biosynthesis of hydroxycinnamic acids, the ref8 mutation affects the production of
the monolignols from which wild-type lignin is polymerized.

[0194] The changes in lignin monomer composition in the ref8 mutant make it clear
that down-regulation of REF8 activity in other plants can be expected to lead to
similar changes in lignin biochemistry. Since lignin monomer composition influences
pulping efficiency as well as other agronomic characteristics of plants, it can again be
anticipated that similar changes in C3H polypeptide activity will add value to plants used in
agriculture and forestry. It is well known in the art that down-regulation of C3H polypeptide
activity could be achieved by many different mechanisms, including, but not limited to, stable
transformation with antisense suppression constructs, stable transformation with sense
suppression constructs, or virus induced gene silencing.

EXAMPLE 5 

Evaluation of the Physical and Chemical Resistance of ref8 Cell Walls 
[0195] The experiments described in the previous examples demonstrated that
ref8 plants deposit less lignin than the wild type, and that the lignin of ref8 plants is
synthesized from monomers that are normally at best only very minor components of the
wild-type polymer. To determine whether these changes in lignin quality and quantity had a
broader impact on cell wall characteristics, we measured the resistance of wild-type and ref8
cell wall preparations to the activity of polysaccharide hydrolases. Over half of the original
cell wall mass remained after digestion of wild-type cell walls, reflecting the resistance to
enzymatic degradation of crystalline cellulose and lignin. Treatment of wall preparations
from the syringyl lignin-deficient fah1 mutant gave similar results, indicating that
perturbations in cell wall chemistry do not necessarily result in altered cell wall degradability.
In contrast, the ref8 cell walls exhibited increased susceptibility to enzymatic digestion; the
residue remaining after digestion was only one fifth of that found with wild-type cell walls.
[0196] The changes in cell wall degradability in the ref8 mutant clearly demonstrate
the potential value of down-regulation of C3H polypeptide activity in plants important to
agriculture and forestry. It is clear that similar changes in cell wall characteristics would lead
to improved utilization of lignocellulosic material in terms of pulp and paper production, and
in agricultural processes including, but not limited to, the use of forages for animal feedstocks,
and the production of other downstream products such as ethanol produced through
fermentation processes.





                    TGA lignin content
                    (A280 mg cell wall⁻¹ ± std. dev.)
wild type           3.12 ± 0.21
ref8                0.57 ± 0.23

Table 3. Impact of the ref8 mutation on rachis lignin content as measured by the TGA
method. The data represent the means of three independent measurements ± standard
deviation.
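The TGA values in Table 3 can be expressed as a fraction of the wild-type level. The sketch below does this with first-order error propagation, assuming the two means are independent and that the standard deviations can be combined as relative errors; the text does not describe any such calculation, so this is an illustration only.

```python
import math


def ratio_with_sd(num, num_sd, den, den_sd):
    """Ratio of two means with a first-order propagated standard deviation."""
    ratio = num / den
    rel_sd = math.sqrt((num_sd / num) ** 2 + (den_sd / den) ** 2)
    return ratio, ratio * rel_sd


# Table 3 values: ref8 0.57 +/- 0.23, wild type 3.12 +/- 0.21.
ratio, ratio_sd = ratio_with_sd(0.57, 0.23, 3.12, 0.21)
```

On these figures the ref8 TGA lignin content is roughly 18% of wild type, with an uncertainty of several percentage points.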

EXAMPLE 6 
Isolation of REF8 Gene 
[0197] To isolate the REF8 gene using positional cloning, we took advantage of the
advanced state of the Arabidopsis genome sequencing effort. Using a mapping population of
535 F2 plants derived from a ref8/ref8 (Columbia background) x REF8/REF8 (Landsberg
erecta) cross, the position of the REF8 gene was initially determined to be between markers
nga168 and T8M12. Thirty nine plants were found to carry chromosomes that were
recombinant within this region, and these individuals were studied further to determine a
smaller mapping interval for the REF8 gene. These studies showed REF8 to lie between
markers T7D17 and SGCSNP169, a region defined by a contig of approximately 10 BACs.
The annotations of these clones were inspected for genes encoding putative oxidases and
hydroxylases at http://www.mips.biochem.mpg.de/proj/thal. Two P450s and one peroxidase
were identified within this region that we considered to be candidates for REF8 (Fig. 3).
Based upon the relative position of these genes within the mapping interval, and the
recombination distances determined for the flanking upper and lower markers (2
recombination events between marker T7D17 and REF8; 10 recombination events between
REF8 and marker SGCSNP169), we identified a gene on BAC T20B5 encoding a putative
P450 belonging to the CYP98 class (CYP98A3) as the most likely candidate for REF8. To
provide a preliminary indication whether CYP98A3 was likely to correspond to REF8, we
evaluated the tissue specificity of its expression using RNA gel blot hybridization.
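The recombination counts on either side of REF8 indicate where in the interval the gene is likely to lie. The following sketch estimates that relative position, and converts the counts to recombination percentages under the assumptions that all 535 F2 plants were scored at both flanking markers and that each plant contributes two informative chromosomes; neither assumption is stated in the text, and the function names are illustrative.

```python
def relative_position(recomb_left, recomb_right):
    """Fractional distance of the gene from the left marker within the interval."""
    return recomb_left / (recomb_left + recomb_right)


def recombination_percent(recombinants, plants, gametes_per_plant=2):
    """Recombinant chromosomes as a percentage of all chromosomes scored."""
    return 100.0 * recombinants / (plants * gametes_per_plant)


# 2 events between T7D17 and REF8; 10 events between REF8 and SGCSNP169.
position = relative_position(2, 10)
left_pct = recombination_percent(2, 535)
right_pct = recombination_percent(10, 535)
```

On these assumptions REF8 sits about one sixth of the way from T7D17 toward SGCSNP169, which is the kind of reasoning that favors a candidate gene near the T7D17 end of the contig.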

EXAMPLE 7 
Characterization of C3H Polypeptide 
[0198] The experiments described above strongly suggested that the Arabidopsis 

REF8 gene corresponds to T20B5.9, a gene also annotated as CYP98A3, a putative P450. To 
provide further supporting evidence for this hypothesis, we sequenced the putative re/8 
cDNA. The mutant sequence contained a single G to A nucleotide substitution that leads to a 
non-conservative amino acid substitution near the conserved heme-binding region of the 
protein (Fig. 7). To evaluate whether the G to A nucleotide substitution found in the mutant 
gene impairs enzymatic function, we expressed the wild-type and mutant genes in E. coli and 
yeast. For expression of the protein in E. coli, we used the pBOV vector, a version of 
pCWOri+ modified for the high-level expression of eukaryotic P450s (Barnes et al., 1991). 
In this vector, the coding sequence of a portion of the N-terminal domain of the eukaryotic 
P450 is replaced by the first eight codons of the bovine P450 CYP17. For the yeast 
experiments, we expressed the native CYP98A3 protein using the vector YeDP60 (Urban et 
al., 1990). The CYP98A3 expression constructs were then transformed into WAT11 yeast in 
which the endogenous yeast P450 reductase gene has been replaced with the ATR1 
Arabidopsis P450 reductase gene under the control of the yeast GAL10-CYC1 promoter to 
provide the expressed P450 with high levels of its normal reductase partner (Pompon et al., 
1996). 

[0199] First, membrane preparations were isolated for spectroscopic and SDS-PAGE 
analysis from E. coli carrying either the pBOV control vector, the REF8 expression vector 
pBOV-REF8, or the pBOV-ref8 construct carrying the mutant gene sequence. SDS-PAGE 
analysis indicated that membranes from bacteria carrying pBOV-REF8 and pBOV-ref8 
contained an abundant protein with a molecular mass of approximately 58 kDa, consistent 
with the expected mass of 57,926 Da for the inferred translation product of the putative C3H 
cDNA (Fig. 8a). This protein was absent in samples prepared from bacteria carrying the 
control pBOV vector. As expected, carbon monoxide difference spectroscopy indicated that 
samples prepared from control bacteria contained no spectrally active P450. Spectroscopic 
examination of membranes prepared from bacteria expressing the putative C3H protein 
revealed a 450 nm absorbance peak characteristic of P450s (Fig. 8b). In contrast, membranes 
from bacteria transformed with the pBOV-ref8 construct gave a CO difference 
spectrum dominated by a peak at 420 nm, indicating that the protein is capable of 
binding the heme prosthetic group found in P450s, but is mis-folded or structurally altered in 
such a way that it is likely to be inactive. 

[0200] Next, microsomal preparations from yeast transformed with 
YeDP60, YeDP60-REF8 or YeDP60-ref8 were similarly analyzed by SDS-PAGE and CO 
difference spectroscopy. Prior to these analyses, membrane preparations were first 
fractionated using a Triton X-114 phase partition procedure to enrich the sample in integral 
membrane proteins (Werck-Reichhart et al., 1991). When analyzed by SDS-PAGE (Fig. 8c), 
the Triton phase prepared from membranes of control yeast contained a number of bands, 
whereas similar preparations from yeast expressing the putative wild-type C3H protein 
contained an additional protein with a molecular mass of approximately 58 kDa. In contrast, 
the mutant protein does not accumulate in yeast carrying the YeDP60-ref8 vector (Fig. 8c). As 
expected, carbon monoxide difference spectroscopy indicated that samples prepared from 
control cells contained essentially no spectrally active P450; whereas, under the same 
conditions membranes prepared from yeast expressing the wild type version of the putative 
C3H protein exhibited a strong 450 nm absorbance peak (Fig. 8d). Taken together, these data 
suggest that the putative C3H protein is probably targeted to the endoplasmic reticulum in 
yeast cells, where it should be catalytically active. In contrast, the mutant protein does not 
accumulate, possibly due to enhanced degradation arising from mis-folding like that 
previously seen when the mutant protein was expressed in E. coli. 
[0201] Finally, WAT11 yeast carrying the control vector and the YeDP60-REF8 
vector were assayed for C3H polypeptide activity in vivo by adding p-coumarate directly to 
the medium of galactose-induced yeast cultures. This method has previously been used to 
demonstrate the activity of C4H and F5H heterologously expressed in yeast (Pierrel et al., 
1994; Humphreys et al., 1999), and exploits the ability of simple hydroxycinnamic acids to 
readily cross yeast membranes. When p-coumarate was added to the medium of control 
yeast, it was the predominant UV-absorbent substance present in the medium after several 
hours of incubation. In contrast, when a parallel experiment was performed with yeast 
harboring the YeDP60-REF8 expression vector, a novel peak was found whose retention time 
and UV-spectrum matched precisely those of caffeic acid (Fig. 6). Replicate experiments 
analyzed by GC-MS with selective ion monitoring permitted unequivocal identification of the 
C3H reaction product. In the context of the phenotypic characterization of the ref8 mutant 
(Franke et al., submitted), these data provide definitive proof that CYP98A3 encodes C3H 
polypeptide, and that C3H polypeptide is a P450. 

EXAMPLE 8 
Other Substrates of C3H Polypeptide 
[0202] Although the in vivo assays of yeast carrying the YeDP60-REF8 vector 
demonstrated that C3H polypeptide is capable of hydroxylating p-coumarate, it was not 
possible to use this approach to determine kinetic constants for the enzyme, nor to use this 
system to assay the activity of C3H toward substrates that cannot readily cross the yeast 
plasma membrane. To experimentally address these issues, C3H-containing microsomes were 
prepared for use in in vitro assays of enzymatic activity. Consistent with the in vivo results, 
incubation of C3H in the presence of p-coumarate in vitro resulted in the production of 
caffeic acid (Fig. 9), although this activity was so low that it precluded detailed kinetic 
analysis. In addition to p-coumarate, several other compounds have been suggested to be 
substrates for the 3-hydroxylase(s) of phenylpropanoid metabolism (Heller and Kühnl, 1985; 
Kühnl et al., 1987; Kneusel et al., 1989; Tanaka and Kojima, 1991). Because the assays using 
p-coumarate suggested that it may not be the optimal substrate for C3H, we assayed the 
activity of the enzyme against an array of other possible substrates. In these experiments, no 
activity of C3H polypeptide toward p-coumaroyl CoA, p-hydroxycinnamyl alcohol, or 
1-O-p-coumaroyl-D-glucose was detected. Levels of activity comparable to those seen with 
p-coumarate were seen when p-coumaraldehyde was used as a substrate. In contrast, much 
higher levels of activity were seen when p-coumaroyl methyl ester was used as a substrate 
(Fig. 7a), although the apparent Km for this substrate (2.5 ± 0.1 mM; Fig. 9) was still higher 
than those of other phenylpropanoid pathway P450s. No activity was seen when cinnamate, 
caffeate or ferulate was used as a substrate for C3H. 
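
To put the apparent Km reported above in context, the following minimal sketch evaluates the standard Michaelis-Menten rate law at a few substrate concentrations. Only the Km value (2.5 mM) comes from the text; the Vmax is normalized to 1.0 and the concentrations are chosen for illustration, so both are assumptions, as is the function name.

```python
def michaelis_menten_rate(substrate_mM: float, vmax: float, km_mM: float) -> float:
    """Michaelis-Menten rate law: v = Vmax * [S] / (Km + [S])."""
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return vmax * substrate_mM / (km_mM + substrate_mM)


KM_MM = 2.5   # apparent Km for p-coumaroyl methyl ester, from the text
VMAX = 1.0    # hypothetical: normalized so rates read as a fraction of Vmax

# Illustrative concentrations only (not taken from the source).
for s in (0.25, 2.5, 25.0):
    v = michaelis_menten_rate(s, VMAX, KM_MM)
    print(f"[S] = {s:5.2f} mM -> v/Vmax = {v:.2f}")
```

By definition the rate is half of Vmax when the substrate concentration equals Km, so an enzyme with a millimolar Km runs well below saturation at lower substrate levels.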
EXAMPLE 9 
Increase in Flavonoid Content and 
Resistance to UV Light 
[0203] A plant in which C3H polypeptide activity is decreased will, as a result, have 
increased flavonoid content. Since C3H polypeptide functions in a biochemical pathway that 
diverts phenylpropanoid pathway intermediates away from flavonoid biosynthesis, it would 
be apparent to one skilled in the art that a plant in which C3H polypeptide activity was down- 
regulated using methods including, but not limited to stable transformation with antisense 
suppression constructs, stable transformation with sense suppression constructs, or virus 
induced gene silencing, would have increased levels of flavonoids. Such a plant would be of 
value because it is well known in the art that flavonoids are important in the resistance of 
plants to UV light (Li et al., 1993). Thus, it can be expected that a plant in which C3H 
polypeptide activity is downregulated would have, among other desirable characteristics, 
enhanced resistance to UV light. 

EXAMPLE 10 
Increase in Isoflavonoid Content and Nutritional Value 
[0204] A plant in which C3H polypeptide activity is decreased will, as a result, have 
increased isoflavonoid content. Since REF8 functions in a biochemical pathway that diverts 
phenylpropanoid pathway intermediates away from isoflavonoid biosynthesis, it would be 
apparent to one skilled in the art that a plant in which REF8 activity was down-regulated 
using methods including, but not limited to stable transformation with antisense suppression 
constructs, stable transformation with sense suppression constructs, or virus induced gene 
silencing, would have increased levels of isoflavonoids. Such a plant would be of value 
because it is well known in the art that isoflavonoids are important nutrients in the human diet 
and have value as phytoestrogens (Humphreys et al., 2000). Thus, it can be expected that a 
plant in which REF8 activity is downregulated would have desirable characteristics including 
but not limited to enhanced levels of isoflavonoids and improved nutritional value. 

EXAMPLE 11 
Increase in Anthocyanin Content 
[0205] Since C3H polypeptide functions in a biochemical pathway that diverts 
phenylpropanoid pathway intermediates away from anthocyanin biosynthesis, it would be 
apparent to one skilled in the art that a plant in which C3H polypeptide activity was down- 
regulated using methods including, but not limited to stable transformation with antisense 
suppression constructs, stable transformation with sense suppression constructs, or virus 
induced gene silencing, would have increased levels of anthocyanins. Such a plant would be 
of value because it is well known in the art that anthocyanins are important flower pigments, 
and that metabolic engineering of flower color is of significant value in the cut flower 
industry, and in the production of horticulturally desirable plants in general (Holton et al., 
1993). Thus, it can be expected that a plant in which REF8 activity is downregulated would 
have desirable characteristics including but not limited to enhanced levels of anthocyanins. 

EXAMPLE 12 
Decrease in Cell Wall-Bound Conjugates 
[0206] A plant in which C3H polypeptide activity is decreased will, as a result, have 
increased anthocyanin content. Since C3H polypeptide functions in a biochemical pathway 
that is required for the synthesis of phenolic compounds commonly found chemically linked 
to plant cell walls, it would be apparent to one skilled in the art that a plant in which C3H 
polypeptide activity was down-regulated using methods including, but not limited to stable 
transformation with antisense suppression constructs, stable transformation with sense 
suppression constructs, or virus induced gene silencing, would have decreased levels of cell 
wall-bound conjugates including but not limited to ferulic and sinapic acids. Such a plant 
would be of value because, among other things, it is well known in the art that cell wall-bound 
phenolics decrease the nutritional value of plants used as animal feedstocks (Jung, H.G. and 
Deetz, D.A. (1993) Cell wall lignification and degradability in Forage Cell Wall Structure 
and Digestibility (HG Jung, DR Buxton, RD Hatfield, J Ralph eds.), ASA/CSSA/SSSA Press, 
Madison, WI). Thus, it can be expected that a plant in which REF8 activity is downregulated 
would have desirable characteristics including but not limited to enhanced digestibility. 
[0207] While the invention has been disclosed in this patent application by 
reference to the details of preferred embodiments of the invention, it is to be understood that 
the disclosure is intended in an illustrative rather than in a limiting sense, as it is 
contemplated that modifications will readily occur to those skilled in the art, within the spirit 
of the invention and the scope of the appended claims. 